fix(core): Don't index part keys with invalid schema #1870

Merged on Oct 16, 2024 (1 commit)

rfairfax (Contributor) commented:

When bootstrapping the raw index, we skip tracking items with invalid schemas, which are signified by partId = -1. However, we currently still add them to the index, which can cause query errors later on, such as the following:

java.lang.IllegalStateException: This shouldn't happen since every document should have a partIdDv
	at filodb.core.memstore.PartIdCollector.collect(PartKeyLuceneIndex.scala:963)
	at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:305)
	at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:247)
	at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:38)
	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:776)
	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:551)
	at filodb.core.memstore.PartKeyLuceneIndex.$anonfun$searchFromFilters$1(PartKeyLuceneIndex.scala:635)
	at filodb.core.memstore.PartKeyLuceneIndex.$anonfun$searchFromFilters$1$adapted(PartKeyLuceneIndex.scala:635)
	at filodb.core.memstore.PartKeyLuceneIndex.withNewSearcher(PartKeyLuceneIndex.scala:279)
	at filodb.core.memstore.PartKeyLuceneIndex.searchFromFilters(PartKeyLuceneIndex.scala:635)
	at filodb.core.memstore.PartKeyLuceneIndex.partIdsFromFilters(PartKeyLuceneIndex.scala:591)
	at filodb.core.memstore.TimeSeriesShard.labelValuesWithFilters(TimeSeriesShard.scala:1782)

This fix ensures that we don't index part keys that we skip during bootstrap, so that the in-memory shard and the index stay consistent with each other.
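
To illustrate the intent, here is a minimal sketch of the skip-during-bootstrap logic. It is a simplified model, not the actual FiloDB code: `PartKeyRecord`, `Shard`, `Index`, and `bootstrap` are hypothetical names made up for this example, and the only detail taken from this PR is that partId = -1 marks an item with an invalid schema.

```scala
// Hypothetical, simplified model of the bootstrap path.
// None of these names are FiloDB's actual classes or methods.
object BootstrapSketch {
  // From the PR description: partId = -1 signifies an invalid schema.
  val InvalidPartId: Int = -1

  final case class PartKeyRecord(partKey: String, schemaValid: Boolean)

  // Stand-in for the in-memory shard's partition tracking.
  final class Shard {
    private var nextId = 0
    val tracked = scala.collection.mutable.Map.empty[Int, String]

    // Returns InvalidPartId (and tracks nothing) when the schema is invalid.
    def track(rec: PartKeyRecord): Int =
      if (!rec.schemaValid) InvalidPartId
      else {
        val id = nextId
        nextId += 1
        tracked(id) = rec.partKey
        id
      }
  }

  // Stand-in for the part key index.
  final class Index {
    val docs = scala.collection.mutable.Map.empty[Int, String]
    def add(partId: Int, partKey: String): Unit = docs(partId) = partKey
  }

  // Indexes only the records that the shard actually tracked.
  // Returns the number of skipped records.
  def bootstrap(records: Seq[PartKeyRecord], shard: Shard, index: Index): Int = {
    var skipped = 0
    records.foreach { rec =>
      val partId = shard.track(rec)
      if (partId == InvalidPartId) skipped += 1 // skipped by the shard, so not indexed
      else index.add(partId, rec.partKey)
    }
    skipped
  }
}
```

With this guard in place, the set of part IDs in the index matches the set tracked by the shard after bootstrap, which is the consistency property this change is after.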

Pull Request checklist

  • The commit message(s) follow the contribution guidelines ?
  • Tests for the changes have been added (for bug fixes / features) ?
  • Docs have been added / updated (for bug fixes / features) ?

Current behavior : (link existing issues here : https://help.github.com/articles/basic-writing-and-formatting-syntax/#referencing-issues-and-pull-requests)

Invalid schema items end up in index

New behavior :

Invalid schema items are skipped

amolnayak311 (Contributor) left a comment:

Thanks @rfairfax

rfairfax merged commit 84f7ade into filodb:develop on Oct 16, 2024
1 check passed
rfairfax deleted the rfairfax/invalid_schema branch on October 16, 2024 14:31