Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds extraction links for moonmining observers #319
base: master
Are you sure you want to change the base?
Adds extraction links for moonmining observers #319
Changes from 3 commits
cbf8187
d814ea5
445821f
29ca26d
fa30811
71f3a11
fdfde79
ca69bb4
699d1ff
d541eeb
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since corporation mining data can also track non corporate character, it would be better to make this relationship to
UniverseName
where primary isentity_id
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added separate link to universe_name since both should be present in my opinion. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem you run into is that by splitting into two relationships, you now need to decide which one to call at runtime. What is the application you are using the relationship for? For name resolution, you are much better-suited sourcing from universe_name as there will be more entries. For linking back to users as I suspect you are doing, this raises an interesting question... As we cannot link back to the character from universe_name without doing individual DB lookups this would be quite expensive.. Makes me have thoughts about polymorphic relationships from UniverseName back to a relevant class (but that is not in scope of this PR). WIll defer to @warlof on this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking about a poly for a while, but there are few issues using it :
you mostly get the primary question in your concern, what will be the purpose of that relation ? And there, the answer is : he's looking for a way to group entries per user - which is not possible on behalf
UniverseName
.against having both
UniverseName
andCharacterInfo
inside same model, however, we might addCharacterInfo
,AllianceInfo
andCorporationInfo
relationships inUniverseName
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, the idea is to group.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a problematic foreign key. I don't know yet how to handle that though.
Spawning a dummy extraction sounds wrong to me - but avoiding data to be recorded isn't good either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We basically have two options here, the first is to create a key to the extraction_id as has been done. The second is to base the relationship in the extractions model which will grab all mining data with that structure_id and between the known datetimes of those extractions. In my dev env I had done the latter.. It hurt performance slightly, but kept the DB free of the extra key. The benefit of the relationship though is you can be much more sure you have the correct extraction as it is based on the dates reported by eve.
The problem with the key that needs to be addressed is the interval used to find the extraction. It should not be plus-minus 5 days. The reason being is you may get overlapping extractions with this range. The minimum drill time is 6 days, which can be started as soon as the current chunk is fractured. An unrigged structure will have an asteroid field that lasts 48h. Combined with the
last_updated
only being accurate to a day, this gives you a separation of only 3 days (48h plus 1d overflow). However, in a T2 rigged structure these fields can be mined for an extra 100% of time, totalling 96 hours. This means when you add the day for overflow, your separation is only 1 day.The other issue, is that if the extraction does not get populated in the DB prior to the ObserverData then the extraction_id resolution occurring in the boot of the ObserverData model will fail to find an extraction_id and will never try again to populate this, losing the link in the data. The only way then to resolve that would be to add a correction job that runs periodically and attaches all orphaned data to extraction_ids if possible. This in my opinion would be clunky.
My proposal instead would be to add an index to last_updated, and then create the relationship in the Extraction to the ObserverData based on matching structure_id and use the days of the extraction key dates (plus 1 on the end) as bounds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it tries to find date by chunk_arrival_time BETWEEN (6 .. last_updated .. 6)
since 6 is the minimal, 5 would be safest I suppose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that last_updated will continue to update right up until the mining stops. This means it will slowly travel from the
chunk_arrival_time
all the way tonatural_decay_time
+ 24h. And you cannot guarantee at what point in that cycle we will first receive that data.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, what we're sure about is :
Early I said : corporation mining has not be implemented because it's not robust yet - but it appears none of you put this in consideration and mostly assume our modeling is right.
What if the observer is falsy recorded in our current architecture ?
Let me rephrase : observer is an event - as per ESI, it has the following mandatory fields :
We already know that
observer_id
is the structure to which observer is attached to.Now, what if we should record data as soon as we don't have a matching pair of
observer_id
,last_updated
?This would imply an addition of a new primary AI field called
id
and a refactor of the observer job in order to spawn a new entry as soon as we don't know about the pairobserver_id
andlast_updated
?Based on this, we can notice all
last_updated
field for eithercorporation_industry_mining_observers
orcorporation_industry_mining_observer_data
are always formatting likeYYYY-mm-dd 00:00:00
- is it an error from our side ? at least, since both match, we can properly linkobserver_data
toobserver_event
(based onobserver_id
andlast_updated
fields).So now, last question, what is the relation between observer and extraction ? Is there something else than the structure ID ? Like, we get an event every time an extraction cycle is pass (start, chunk arrival, explode/natural decay time) ?
Since I'm non longer involving in a corporation, I will not be able to check those upper theory. But, I think it could be interesting to apply those few refactor on a dev env, collect data. Beeing able to validate all upper could help a lot to achieve what you're trying to do without all those complexe things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My corporation is having one moon per day with max cycle. So far this code works good.
Big problem as you said, observerID is just structureID, so to calculate actual extraction you'd need to match observerID+lastUpdated. Since we are storing now also extractionID it is easier to match single moon extraction to calculate actually what was happened in last extraction cycle, and not by lastUpdated, but by whole three extraction days it was available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@warlof You are right. We have not discussed if the structure we have in the DB is the best structure. What we have at the moment mirrors the results we get back from ESI. This data structure is more flexible for possible future expansion as hinted at by the
/corporation/{corporation_id}/mining/observers/
endpoint. This results in thecorporation_industry_mining_observers
table with structure as below:This table is the one @vo1 wanted to deprecate as at the moment it serves little purpose currently. The reason it is of little use for the application of this PR is due to the fact that currently, the only
observer_type
is a structure. That said I would not rule out in the future more mining observers being added based on this. Also, I have not confirmed this, but I have a theory based on the description of the endpoint that a mining structure placed on a moon in such a way that it could start an extraction, but has not yet fired the moon drill, will be present in this return. If correct, this is a good data point to have for the reason of knowing which structures are situated correctly on moon beacons vs just in space (yes I have seen structures not on moons still fitted with moon drills).@vo1 To address your last post see below.
Congrats on running so many moons, but this is not the issue I presented to you above. The issue with the 6-day interval you have chosen is when the moon drills are set to minimum cycle, with the rig for maximum asteroid field life. Then you run into problems with a 6-day interval overlapping across multiple
extraction_id
. You can easily differentiate between different moons with theobserver_id
so the fact you have overlapping cycles across multiple moons is no problem. The issue is the close spacing of cycles on the same moon.The fact that
observer_id
is just thestructure_id
holds, for now, is true. However, based on CCP's API we should be ready for that not to always be the case.Yes, with the
extraction_id
stored it is much easier to match the details, but you are not guaranteeing storage of theextraction_id
in this PR. If theCorporationIndustryMiningExtraction
model does not have a record of the extraction when theObserverDetails
job runs then theextraction_id
will not be populated and will be forever orphaned. Either a periodic job needs to be created which will re-attach the orphaned data, or the relationship based on thelast_updated
field used as this does not require data hydration to function.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I lowered to 5 days since 3d I think is kinda unsafe.
Grouping by last_updated with some virtual column that will make it a single extraction is simply same as grouping them by extraction_id with this 5d range, except slower.
Post-processing with a cronjob I think also might have similar problem.