Using the filesystem as a searchable database.
Rocketstore is a Python library for data storage. It provides an interface for storing and retrieving data in various formats.
ORIGINAL / NODE VERSION: https://github.com/Paragi/rocket-store/
You can download it from the PyPI repository: https://pypi.org/project/Rocket-Store/
Rocket-Store was made to replace a more complex database, in a setting that required a low footprint and high performance.
Rocket-Store is intended to store and retrieve records/documents, organized in collections, using a key.
Terms used:
- Collection: name of a collection of records. (Like an SQL table)
- Record: the data stored. (Like an SQL row)
- Data storage area: area/directory where collections are stored. (Like an SQL database)
- Key: every record has exactly one unique key. The key is used as a file name (with the same restrictions), and searches accept the same wildcards as file names.
Compare Rocket-Store, SQL and file system terms:
Rocket-Store | SQL | File system |
---|---|---|
storage area | database | data directory root |
collection | table | directory |
key | key | file name |
record | row | file |
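The mapping in the table can be illustrated with a minimal sketch that uses only the standard library. It is not Rocketstore's implementation: the helper names `record_path`, `write_record` and `read_record` are hypothetical, and JSON is assumed as the on-disk format.

```python
import json
import tempfile
from pathlib import Path


def record_path(storage_area, collection, key):
    """Map (collection, key) to a file: <storage area>/<collection>/<key>."""
    return Path(storage_area) / collection / key


def write_record(storage_area, collection, key, record):
    """Store one record as one file inside the collection directory."""
    path = record_path(storage_area, collection, key)
    path.parent.mkdir(parents=True, exist_ok=True)  # the collection is a directory
    path.write_text(json.dumps(record), encoding="utf-8")
    return path


def read_record(storage_area, collection, key):
    """Load the record stored under an exact key."""
    path = record_path(storage_area, collection, key)
    return json.loads(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_record(tmp, "cars", "mercedes", {"owner": "Lisa"})
        print(read_record(tmp, "cars", "mercedes"))
```

Because a key is simply a file name, looking up a record by its exact key needs no index: the path can be computed directly.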
- Support for file locking.
- Support for creating data storage directories.
- Support for adding auto incrementing sequences and GUIDs to keys.
To use Rocketstore, you must first import the library:
from Rocketstore import Rocketstore
rs = Rocketstore()
Usage of constants:
#method 1:
rs = Rocketstore()
rs.post(..., rs._FORMAT_JSON)
#or
rs.post(..., Rocketstore._FORMAT_JSON)
rs.post(collection="delete_fodders1", key="1", record={"some":"json input"}, flags=Rocketstore._FORMAT_JSON)
# or
rs.post("delete_fodders1", "1", {"some":"json input"}, Rocketstore._FORMAT_JSON)
Stores a record in a collection identified by a unique key
Collection name to contain the records.
Key uniquely identifying the record
No path separators, wildcards, etc. are allowed in collection names and keys. Illegal characters are silently stripped.
Content Data input to store
Options
- _ADD_AUTO_INC: Add an auto incremented sequence to the beginning of the key
- _ADD_GUID: Add a Globally Unique IDentifier to the key
Returns a dictionary containing the result of the operation:
- count : number of records affected (1 on success)
- key: string containing the actual key used
If the key already exists, the record will be replaced.
If no key is given, an auto-incremented sequence is used as key.
If the function fails for any reason, an error is thrown.
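The rules above (illegal characters stripped, existing keys replaced, a sequence used when no key is given) can be sketched with the standard library alone. This is an illustration under assumptions, not the library's code: the set of stripped characters, the `<collection>_seq` counter file, the `-` separator between sequence and key, and the names `sanitize`, `next_sequence` and `post_sketch` are all hypothetical. The sketch also does no file locking, so it is not safe for concurrent writers.

```python
import json
import re
import tempfile
from pathlib import Path

# Assumption: the exact set of illegal characters is not listed in this README.
# This sketch strips path separators, wildcards and a few other risky characters.
_ILLEGAL = re.compile(r'[\\/*?<>|":\x00-\x1f]')


def sanitize(name):
    """Silently drop illegal characters instead of raising an error."""
    return _ILLEGAL.sub("", name)


def next_sequence(storage_area, collection):
    """Increment and return a per-collection counter (hypothetical file name)."""
    seq_file = Path(storage_area) / f"{collection}_seq"
    current = int(seq_file.read_text()) if seq_file.exists() else 0
    seq_file.write_text(str(current + 1))
    return current + 1


def post_sketch(storage_area, collection, key, record, add_auto_inc=False):
    """Store a record and report the key that was actually used."""
    collection = sanitize(collection)
    key = sanitize(key or "")
    collection_dir = Path(storage_area) / collection
    collection_dir.mkdir(parents=True, exist_ok=True)
    if add_auto_inc or not key:
        # Prefix the key with the sequence, or use the sequence alone if no key was given.
        sequence = next_sequence(storage_area, collection)
        key = f"{sequence}-{key}" if key else str(sequence)
    # Writing to an existing file replaces the previous record.
    (collection_dir / key).write_text(json.dumps(record), encoding="utf-8")
    return {"count": 1, "key": key}
```

Returning the key matters because the caller cannot predict it once characters have been stripped or a sequence has been added.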
Find and retrieve records, in a collection.
rs.get(collection="delete_fodders1")
# or
rs.get("delete_fodders1")
# Get wildcard
rs.get("delete_*")
# Get wildcard in collection
rs.get("*")
# Get wildcard in key (see sample in Samples/queries.py)
rs.get("delete_fodders1", "*")
# Get only auto incremented rows (see sample in Samples/queries.py)
rs.get("delete_fodders1", "?")
# get only keys
rs.get("delete_fodders1", "*", Rocketstore._KEYS)
Collection to search. If no collection name is given, get will return a list of database assets: collections, sequences, etc.
Key to search for. Can be mixed with wildcards '*' and '?'. An undefined or empty key is the equivalent of '*'
Options:
- _ORDER : Results returned are ordered alphabetically ascending.
- _ORDER_DESC : Results returned are ordered alphabetically descending.
- _KEYS : Return keys only (no records)
- _COUNT : Return record count only
Returns a dictionary containing:
- count : number of records affected
- key : array of keys
- result : array of records
NB: wildcards are very expensive on large datasets with most filesystems. On a regular PC with more than 10^7 records in the collection, it might take up to a second to retrieve one record with a wildcard search, whereas up to 100,000 records might be retrieved in the same time with exact key matches.
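The cost difference between exact and wildcard keys follows from how a file system is searched. The sketch below is a simplified illustration, not the library's code: `wildcard_to_regex` and `find_keys` are hypothetical names, and the `descending` parameter stands in for the _ORDER and _ORDER_DESC options. An exact key needs one path lookup, while a wildcard forces a scan of every entry in the collection directory.

```python
import os
import re
import tempfile
from pathlib import Path


def wildcard_to_regex(pattern):
    """Translate a key pattern: '*' matches any run of characters, '?' exactly one."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def find_keys(collection_dir, pattern="", descending=None):
    """Return matching keys; descending=False/True sorts, None leaves scan order."""
    collection_dir = Path(collection_dir)
    pattern = pattern or "*"  # an empty key is equivalent to '*'
    if "*" not in pattern and "?" not in pattern:
        # Exact key: one direct path lookup, no directory scan.
        keys = [pattern] if (collection_dir / pattern).is_file() else []
    else:
        # Wildcard: every directory entry has to be inspected.
        regex = wildcard_to_regex(pattern)
        with os.scandir(collection_dir) as entries:
            keys = [e.name for e in entries if e.is_file() and regex.fullmatch(e.name)]
    if descending is not None:
        keys.sort(reverse=descending)
    return {"count": len(keys), "key": keys}
```

With millions of files in one directory, the scan in the wildcard branch dominates the run time, which is why exact keys are preferable on hot paths.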
Delete one or more records whose keys match.
# Delete database
rs.delete()
# Delete collection with content
rs.delete("delete_fodders1")
# Delete wild collection
rs.delete("delete_*")
# Delete exact key
rs.delete("delete_fodders1", "1")
Collection to search. If no collection is given, THE WHOLE DATABASE IS DELETED!
Key to search for. Can be mixed with wildcards '*' and '?'. If no key is given, THE ENTIRE COLLECTION INCLUDING SEQUENCES IS DELETED!
Returns a dictionary containing:
- count : number of records or collections affected
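The three levels of deletion can be sketched as follows. This is a hedged illustration with a hypothetical `delete_sketch` function, not the library's implementation; it ignores sequences and locking. It uses `fnmatch.fnmatchcase`, which also understands `[...]` character sets, slightly more than the two wildcards described here.

```python
import fnmatch
import shutil
import tempfile
from pathlib import Path


def delete_sketch(storage_area, collection=None, key=None):
    """Delete matching records or collections and report how many were removed."""
    root = Path(storage_area)
    if not collection:
        # No collection: everything in the storage area goes.
        targets = list(root.iterdir())
    elif not key:
        # Collection only: remove every matching collection directory.
        targets = [p for p in root.iterdir()
                   if p.is_dir() and fnmatch.fnmatchcase(p.name, collection)]
    else:
        # Collection and key: remove matching record files only.
        collection_dir = root / collection
        candidates = collection_dir.iterdir() if collection_dir.is_dir() else []
        targets = [p for p in candidates
                   if p.is_file() and fnmatch.fnmatchcase(p.name, key)]
    for path in targets:
        if path.is_dir():
            shutil.rmtree(path)
        else:
            path.unlink()
    return {"count": len(targets)}
```

The broader the call, the more is removed, so it is worth checking that the collection and key arguments are not accidentally empty before calling a function like this.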
Can be called at any time to change the configuration values of the initialized instance
Options:
- data_storage_area: The directory where the database resides. The default is to use a subdirectory of the temporary directory provided by the operating system. If that doesn't work, the DOCUMENT_ROOT directory is used.
- data_format: Specifies the format in which records are stored. Values are: _FORMAT_NATIVE (default) and _FORMAT_JSON (use the JSON data format).
rs.options(data_format=Rocketstore._FORMAT_JSON)
# or
rs.options(**{
"data_format": Rocketstore._FORMAT_JSON,
...
})
Another option is to add a GUID to the key. The GUID is a combination of a timestamp and a random sequence, formatted in accordance with RFC 4122 (valid, but slightly less random).
If IDs are generated more than 1 millisecond apart, they are 100% unique. If two IDs are generated at shorter intervals, the likelihood of collision is up to 1 in 10^15.
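To make the idea concrete, here is a minimal sketch of a timestamp-plus-random identifier in RFC 4122 layout. The bit layout (48-bit millisecond timestamp followed by random bits) and the name `timestamp_guid` are assumptions chosen for illustration; the library may arrange the fields differently, and the collision figure quoted above does not apply to this sketch.

```python
import secrets
import time
import uuid


def timestamp_guid(now_ms=None):
    """Build a UUID whose leading 48 bits are a millisecond timestamp."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    timestamp = now_ms & ((1 << 48) - 1)  # keep the lowest 48 bits
    random_bits = secrets.randbits(80)    # fills the remaining 80 bits
    # version=4 makes uuid.UUID overwrite the version and variant bits, so the
    # result is a syntactically valid RFC 4122 UUID with fewer random bits.
    return uuid.UUID(int=(timestamp << 80) | random_bits, version=4)


if __name__ == "__main__":
    print(timestamp_guid())
```

IDs created in different milliseconds differ in the timestamp part, so only IDs created within the same millisecond depend on the random part for uniqueness.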
Contributions are welcome. Please open an issue to discuss what you would like to change.
- https://packaging.python.org/en/latest/tutorials/packaging-projects/
- https://realpython.com/pypi-publish-python-package/
Local:
python -m pip install build twine
python3 -m build
twine check dist/*
twine upload dist/*
Live: no manual steps are needed; a GitHub workflow action publishes the package automatically.
In the root folder, create and activate a virtual environment: virtualenv ./venv && . ./venv/bin/activate
Then run: pip install -e .