Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Make IndexAPI call the database #2201

Conversation

ukstv
Copy link
Contributor

@ukstv ukstv commented Jun 1, 2022

Potential controversies:

  • We might have indexing database backend not configured. It makes the queryIndex method return an empty page. We also log a warning for every query.
  • Use Repository::streamState to load StreamState for StreamID. We assume that a state store always contains StreamState for an indexed stream.

Base branch is not development to reduce mental overhead for reviewing.

@linear
Copy link

linear bot commented Jun 1, 2022

NET-1522 Implement IndexAPI for Ceramic Core

Expose created DatabaseIndexAPI

@ukstv ukstv requested review from stbrody and willex June 1, 2022 16:18
@@ -205,7 +205,6 @@ export class Ceramic implements CeramicApi {
this._networkOptions = params.networkOptions
this._loadOptsOverride = params.loadOptsOverride
this._indexing = modules.indexing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to my comment on the previous PR: now that we pass the indexing into the IndexApi, do we need to store a reference here in the Ceramic class, or can we just keep the reference to this.index and have it be responsible for managing the DatabaseIndexApi?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better leave the discussion there, I guess.

if (this.databaseIndexApi) {
const page = await this.databaseIndexApi.page(query)
const streamStates = await Promise.all(
page.entries.map((streamId) => this.repository.streamState(streamId))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repository.streamState() doesn't use the loading queue or add the state to the cache. Should we be using Repository.get() instead?

async get(streamId: StreamID): Promise<RunningState | undefined> {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was Repository::load first :D before I realised that the users will most probably load hundreds of streams at once. In my opinion, there is a pretty high chance a Ceramic node will never see some stream except in the indexing response. In this case, why make its life harder with loading queues, cache invalidation and so on?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I more buy the argument that we should skip the loading queue since we know we won't be generating pubsub queries so the need to rate limit is less. I'm a little less convinced that we should skip caching streams that are loaded via queries. I do see the concern though that one large query could blow away your entire cache in one fell swoop. So I'm okay with going with this for now and we can revisit in the future if this causes issues.

async queryIndex(query: BaseQuery & Pagination): Promise<Page<StreamState>> {
if (this.databaseIndexApi) {
const page = await this.databaseIndexApi.page(query)
const streamStates = await Promise.all(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little worried about us hammering S3 with hundreds of simultaneous requests. I think we might want to break this down into smaller batches and/or rate limit how fast we hammer the state store with these requests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it is a responsibility of S3StateStore then, and not this thing. Created an issue for that.

@ukstv ukstv force-pushed the feature/net-1522-implement-indexapi-for-ceramic-core branch from e8947c9 to 2d9b84e Compare June 2, 2022 16:53
packages/core/src/indexing/local-index-api.ts Show resolved Hide resolved
if (this.databaseIndexApi) {
const page = await this.databaseIndexApi.page(query)
const streamStates = await Promise.all(
page.entries.map((streamId) => this.repository.streamState(streamId))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I more buy the argument that we should skip the loading queue since we know we won't be generating pubsub queries so the need to rate limit is less. I'm a little less convinced that we should skip caching streams that are loaded via queries. I do see the concern though that one large query could blow away your entire cache in one fell swoop. So I'm okay with going with this for now and we can revisit in the future if this causes issues.

Copy link
Contributor

@stbrody stbrody left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@ukstv ukstv self-assigned this Jun 6, 2022
@ukstv ukstv force-pushed the feature/net-1522-implement-indexapi-for-ceramic-core branch from 8cd8182 to aa6aa91 Compare June 6, 2022 08:49
@ukstv ukstv merged commit 0c8aad4 into feature/net-1521-expose-indexingapi-as-part-of-ceramicapi Jun 6, 2022
@ukstv ukstv deleted the feature/net-1522-implement-indexapi-for-ceramic-core branch June 6, 2022 08:52
ukstv added a commit that referenced this pull request Jun 6, 2022
…ent (#2200)

* feat: add dummy implementation of IndexClientApi to core and http-client

* feat: Rename IndexApi

* chore: Different version of the index initialization

* feat: Make IndexAPI call the database (#2201)

* feat: Make IndexAPI call the database

* feat: use Repository::streamState to get StreamState by StreamID

* chore: simpler preparation of queryIndex result

* Update packages/core/src/indexing/local-index-api.ts

Co-authored-by: Spencer T Brody <[email protected]>

* chore: After-rebase fixes

Co-authored-by: Spencer T Brody <[email protected]>

* chore: After-merge fixes

Co-authored-by: Spencer T Brody <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants