This repository has been archived by the owner on May 13, 2020. It is now read-only.

Releases: mshuaic/Blockchain_med

Baseline7 & 8

24 Aug 02:53
  • Baseline8
    Modified the AND query algorithm.
    An AND query now queries only one stream and then sorts the result in memory.
    How do we pick the stream?
    Priority: the stream with the smaller result set has higher priority.
    For example, a user queries (timestamp AND id). The Timestamp stream
    normally returns only one result and the ID stream returns four results,
    so the Timestamp stream has higher priority than the ID stream. We should query
    the Timestamp stream, get the result, and sort it in memory.
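
The priority rule above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's code: `pick_stream`, `and_query`, the `estimated_sizes` mapping, and the in-memory `streams` dictionary are hypothetical stand-ins for real stream queries. The in-memory step here filters on the remaining conditions and then sorts by timestamp, which is one reading of "sorts the result in memory".

```python
from typing import Dict, List

Record = Dict[str, str]
# Hypothetical in-memory model: attribute name -> key -> records stored under that key.
Streams = Dict[str, Dict[str, List[Record]]]


def pick_stream(estimated_sizes: Dict[str, int]) -> str:
    """Return the attribute whose stream is expected to return the fewest items."""
    return min(estimated_sizes, key=lambda attr: estimated_sizes[attr])


def and_query(
    streams: Streams,
    conditions: Dict[str, str],
    estimated_sizes: Dict[str, int],
) -> List[Record]:
    """Query only the highest-priority stream, then finish the AND in memory."""
    # Consider only the attributes that actually appear in the query.
    first = pick_stream({attr: estimated_sizes[attr] for attr in conditions})
    candidates = streams[first].get(conditions[first], [])
    # Apply every condition locally instead of querying the other streams.
    matched = [
        r for r in candidates
        if all(r.get(attr) == value for attr, value in conditions.items())
    ]
    return sorted(matched, key=lambda r: r["timestamp"])
```

With an estimate of one item for the timestamp stream and four for the ID stream, only the timestamp stream is read; the ID condition is checked on the returned records.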

    baseline8

  • Baseline7
    Baseline6 + database normalization

    baseline7

Baseline 5 & 6

01 Aug 21:35
  • Baseline6
    New feature found
    In a JSON-RPC call, we can batch multiple calls into an array and send it to the server once.
    Previously, sending multiple calls bottlenecked performance (latency
    = # of calls * network latency). Now we batch multiple calls and send them only
    once. The server deserializes the calls locally, so the former bottleneck
    is no longer a problem.
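
A rough sketch of what batching looks like on the client side. The helper names `build_batch` and `estimated_latency` are hypothetical, and the envelope fields follow the JSON-RPC 2.0 batch form; the version string a particular server expects is an assumption and may differ. Nothing is sent over the network here; the function only builds the request body.

```python
import json
from typing import Any, List, Sequence, Tuple


def build_batch(calls: Sequence[Tuple[str, List[Any]]], version: str = "2.0") -> str:
    """Serialize several (method, params) calls into one JSON array."""
    batch = [
        {"jsonrpc": version, "method": method, "params": params, "id": i}
        for i, (method, params) in enumerate(calls)
    ]
    return json.dumps(batch)


def estimated_latency(round_trips: int, network_latency_s: float) -> float:
    """Latency model from the note above: # of round trips * network latency."""
    return round_trips * network_latency_s
```

Under this model, ten separate calls cost ten round trips, while one batch of ten calls costs a single round trip plus server-side processing.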

    Based on this idea, we restructured the transaction:

    • Before: We write the same data into multiple streams in one transaction.
      Every vout of the transaction contains the same data. The data is duplicated
      many times in that transaction.
    • After: Write the actual data into one stream, and write empty data to the
      other streams in the same transaction. All those stream items share the
      same txid, so we can use getstreamitem to retrieve the item from the first
      stream after finding that txid by listing the other streams' items. The result
      often contains multiple txids, so we use a batch call to query the actual data.
      Note: this approach is similar to the unique ID approach. See 06/28/2018
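
The "After" layout can be modeled with a toy in-memory store. This is only a sketch of the idea: `StreamStore` and its methods are hypothetical and do not call the blockchain; `list_txids` stands in for listing an index stream's items, and `get_items` stands in for a batch of getstreamitem calls.

```python
from typing import Dict, Iterable, List, Tuple

Item = Tuple[str, str, str]  # (key, txid, data)


class StreamStore:
    """Toy stand-in for a set of streams, keyed by stream name."""

    def __init__(self) -> None:
        self.items: Dict[str, List[Item]] = {}

    def publish(
        self,
        txid: str,
        data_stream: str,
        data_key: str,
        data: str,
        index_entries: Iterable[Tuple[str, str]],
    ) -> None:
        """One transaction: real data in one stream, empty data in the others."""
        self.items.setdefault(data_stream, []).append((data_key, txid, data))
        for stream, key in index_entries:
            self.items.setdefault(stream, []).append((key, txid, ""))

    def list_txids(self, stream: str, key: str) -> List[str]:
        """Find the txids of all items stored under a key in an index stream."""
        return [txid for k, txid, _ in self.items.get(stream, []) if k == key]

    def get_items(self, stream: str, txids: Iterable[str]) -> List[str]:
        """Fetch the actual data for several txids in one pass (the batch step)."""
        wanted = set(txids)
        return [d for _, txid, d in self.items.get(stream, []) if txid in wanted]
```

A lookup by user then becomes `store.get_items("timestamp", store.list_txids("user", "alice"))`, with the line stored only once per transaction.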

    Storage usage is reduced as a result, since the data is no longer duplicated within a transaction.
    baseline6

  • Baseline5
    Built a multi-level indexing structure for timestamps on the blockchain

    • 1-level: timestamp gap = 10000. In this level, one key stores about ten
      values.
    • 2-level: timestamp gap = 1-level gap * 100. In this level, one key stores
      100 1-level key-value records.
    • n-level: timestamp gap = (n-1)-level gap * 100. One key stores
      100 (n-1)-level key-value records.

    The number of levels is determined by the range of timestamps. (NEED TO DO MORE
    TESTS LATER)
    Now, the range query gets a batch of records at once.
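
The gap arithmetic for the levels can be written out as a short sketch. The function names are hypothetical, and `levels_needed` is only one possible way to derive the number of levels from the timestamp range (the note above says this still needs testing), so treat it as an assumption rather than the actual rule.

```python
BASE_GAP = 10_000  # 1-level timestamp gap, from the note above
FANOUT = 100       # each higher level covers 100 keys of the level below


def level_gap(level: int) -> int:
    """Timestamp gap covered by one key at the given level (1-based)."""
    if level < 1:
        raise ValueError("level must be >= 1")
    return BASE_GAP * FANOUT ** (level - 1)


def level_key(timestamp: int, level: int) -> int:
    """Index key that a timestamp falls under at the given level."""
    return timestamp // level_gap(level)


def levels_needed(ts_min: int, ts_max: int) -> int:
    """Assumed rule: add levels until one top-level gap spans the whole range."""
    span = ts_max - ts_min + 1
    level = 1
    while level_gap(level) < span:
        level += 1
    return level
```

For example, timestamp 1234567 falls under key 123 at level 1 and key 1 at level 2.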
    4v5

baseline4

18 Jul 02:03
  • Baseline4 (database normalization)
    Most lines' Ref-IDs refer back to the same original ID, which means those
    lines' User and Resource are the same. For this reason, we can exclude User
    and Resource from the transaction. When querying User and Resource, Baseline4
    first gets the original Node+ID, then uses Node+ID to query the additional
    results, and unions them.
    • In-memory solution: First query Node, then extract the lines that
      have a matching ID.
    • On-blockchain solution: Create an additional Node+Ref-ID stream whose
      key is Node+Ref-ID and whose value is the log record. We can query Node+ID =
      Node+Ref-ID to get the additional results.

Obviously, the former solution uses more memory (we don't have the ability to benchmark memory yet), and the latter solution requires more storage space.
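
A minimal sketch of the in-memory variant, assuming a hypothetical record layout: every record carries `node`, `id`, and `ref_id`, only original records carry `user` and `resource`, and an original record's `ref_id` equals its own `id`. The function name `query_by_user` and the field names are illustrative, not taken from the repository.

```python
from typing import Dict, List

Record = Dict[str, str]


def query_by_user(records: List[Record], user: str) -> List[Record]:
    """Find originals for a user, then union in the lines that refer back to them."""
    # Step 1: original lines still store User, so they can be matched directly.
    originals = [r for r in records if r.get("user") == user]
    keys = {(r["node"], r["id"]) for r in originals}

    # Step 2: scan for lines whose Node+Ref-ID matches an original's Node+ID.
    result = list(originals)
    seen = set(keys)
    for r in records:
        ident = (r["node"], r["id"])
        if (r["node"], r["ref_id"]) in keys and ident not in seen:
            result.append(r)
            seen.add(ident)
    return result
```

The on-blockchain variant would replace the scan in step 2 with a lookup in the additional Node+Ref-ID stream.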

Baseline3

06 Jul 03:11
  • (Done) Baseline3
    • Using multiple streams to insert data into the blockchain:
      one line of log data -> one transaction, and the transaction adds data to
      multiple streams (tables) atomically. This reduces the number of transactions to 1/7
      (7 is the number of attributes). As a result, both insertion time and storage
      decrease dramatically. See the figure below.

    • Build an indexing solution for timestamp:

      • Retrieve all timestamps from Blockchain
      • Build a sorted list for the timestamps

      Now we are able to do fast range queries. However, this solution builds an index
      table in memory, which is prohibited by the competition. We just use it as
      an experiment for now.
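
The two steps above (collect the timestamps, keep them sorted) are enough for a fast range query via binary search. The sketch below uses Python's standard `bisect` module; `build_index` and `range_query` are hypothetical names, and the list of timestamps is assumed to have been retrieved from the blockchain already.

```python
from bisect import bisect_left, bisect_right
from typing import Iterable, List


def build_index(timestamps: Iterable[int]) -> List[int]:
    """Build the in-memory index: a sorted list of all timestamps."""
    return sorted(timestamps)


def range_query(index: List[int], start: int, end: int) -> List[int]:
    """Return all indexed timestamps t with start <= t <= end."""
    lo = bisect_left(index, start)   # first position with value >= start
    hi = bisect_right(index, end)    # first position with value > end
    return index[lo:hi]
```

Each range query costs two binary searches plus the slice, instead of one lookup per timestamp in the range.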

  • (Done) Sorting function
    We are able to sort query results. The test only uses 400 records, so
    it does not show any significant difference in time. We will increase the number
    of records later.

v0.2.1

28 Jun 19:01
  • optimized baseline2 algorithm
  • added plot support

Baseline Implementation v0.2

22 Jun 00:55
  • added hash pointer solution

baseline implementation v0.1.1

21 Jun 19:57
  • cleaned up the code
  • reorganized the file structure

baseline implementation

20 Jun 20:30

Baseline implementation version 0.1

  • Insertion: insert n copies (n = number of attributes) into the blockchain, using each attribute as the key and the entire line as the value
  • Range query: query from the start time and increase the timestamp by 1 each time until the end time. The total number of queries needed is (end - start)
  • AND operation: query using a single attribute and do the AND operation locally
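
The three baseline operations can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the v0.1 code: the `chain` dictionary stands in for the blockchain, the function names are hypothetical, and the range query here includes both endpoints.

```python
from collections import defaultdict
from typing import Any, DefaultDict, Dict, List, Tuple

Line = Dict[str, Any]
Chain = DefaultDict[Tuple[str, Any], List[Line]]


def insert_line(chain: Chain, line: Line) -> None:
    """Insertion: store one copy of the whole line under each attribute."""
    for attr, value in line.items():
        chain[(attr, value)].append(line)


def range_query(chain: Chain, start: int, end: int) -> List[Line]:
    """Range query: one lookup per timestamp, stepping by 1 from start to end."""
    results: List[Line] = []
    for ts in range(start, end + 1):
        results.extend(chain.get(("timestamp", ts), []))
    return results


def and_query(chain: Chain, conditions: Dict[str, Any]) -> List[Line]:
    """AND: query a single attribute, then check the other conditions locally."""
    first_attr, first_value = next(iter(conditions.items()))
    candidates = chain.get((first_attr, first_value), [])
    return [
        line for line in candidates
        if all(line.get(a) == v for a, v in conditions.items())
    ]
```

The per-timestamp loop in `range_query` is what makes wide ranges expensive in this baseline, since the number of lookups grows with the width of the range rather than with the number of matching records.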