Skip to content
This repository has been archived by the owner on Jan 31, 2020. It is now read-only.

sv_squaredancer_0.1 error (workflow related, "instance" violates foreign key constraint "instance_peer_instance_id_fkey") #52

Closed
gatoravi opened this issue Dec 5, 2013 · 10 comments
Assignees
Labels

Comments

@gatoravi
Copy link
Contributor

gatoravi commented Dec 5, 2013

The somatic variation build fails with this error in the SquareDancer step,

2013-12-03 11:49:46-0600 blade16-4-16: DBD::Pg::st execute failed: ERROR:  insert or update on table "instance" violates foreign key constraint "instance_peer_instance_id_fkey"
2013-12-03 11:49:46-0600 blade16-4-16: DETAIL:  Key (peer_instance_id)=(3af6011c5c4311e3a5b4337be30ae95e) is not present in table "instance". at /opt/gms/GD0T679/sw/ur/lib/UR/DBI.pm line 888.
2013-12-03 11:49:46-0600 blade16-4-16: 2013/12/03 11:49:46 Workflow::DataSource::InstanceSchema id(Workflow::DataSource::InstanceSchema): Workflow::DataSource::InstanceSchema: Error executing SQL:
2013-12-03 11:49:46-0600 blade16-4-16:  INSERT INTO workflow.INSTANCE (peer_instance_id,parent_instance_id,parent_execution_id,parallel_index,output_stored,name,intention,input_stored,current_execution_id,WORKFLOW_PLAN_ID,WORKFLOW_INSTANCE_ID) VALUES (?,?,?,?,?,?,?,?,?,?,?)

This error needs to be replicated in a fresh install of the sGMS.

@ghost ghost assigned gatoravi Jan 3, 2014
@malachig
Copy link
Collaborator

We are getting the same errors during 'sv_breakdancer' and 'indel pindel' steps of somatic-variation workflows.

Something fairly fundamental is going wrong here... It is occurs consistently in these steps for any somatic-variation run within the standalone GMS

@malachig
Copy link
Collaborator

Maybe this problem is related to the workflow database schema tables that were migrated from Oracle to Postgres for the purposes of releasing the standalone GMS... In particular the error is on a foreign key constraint in the 'instance' table of the workflow schema.

This foreign key constraint is set up here in the psql dump:
https://github.com/genome/gms/blob/ubuntu-12.04/setup/schema.psql

ALTER TABLE ONLY instance
    ADD CONSTRAINT instance_peer_instance_id_fkey FOREIGN KEY (peer_instance_id) REFERENCES instance(workflow_instance_id);

First question. Do we really want to create a foreign key constraint between 'peer_instance_id' of the 'instance' table and 'workflow_instance_id' of the same table? If we do want this, perhaps there is some compatibility issue between Oracle vs. PostGres self-referential integrity constraints?

Of course the error could be something else entirely that is violating a desired foreign key constraint and the RDBMS is simply creating a useful error here...

@malachig
Copy link
Collaborator

NOTES:

To get more verbose output on interactions with the database try setting one of these when running a somatic-variation build:
UR_DBI_MONITOR_DML=1
UR_DBI_MONITOR_SQL=1

The first will dump updates and inserts, the second will dump all SQL statements.

It is possible that this workflow foreign key constraint problem is related to the sorting of IDs at the time the insert is attempted and this causes a foreign key problem because in this case we have a self-referential foreign key constraint on the table "instance" of the schema "workflow". Both @amb43790 and @sakoht have reviewed the relevant UR code and at first glance this does not seem to be the case. However, it could still be something related to sorting prior to insert.

To further test this theory @amb43790 changed the workflow code and related workflow tables in the postgres schema to use numeric IDs. We then dumped the database, switched to the modified workflow code and attempted a new build. The same foreign key constraint error still occurred.

We next dropped the foreign key constraint entirely by logging into postgres on a test box and dropping the constraint as follows:

sudo -u postgres psql -d genome
ALTER TABLE workflow.instance DROP CONSTRAINT instance_peer_instance_id_fkey;

This seems to have worked and allows pindel and breakdancer parallel jobs to launch and complete successfully. The question remains. Why is this foreign key constraint being violated in the standalone GMS but not within TGI where the same workflows are being created and stored?

Within TGI, these tables are stored in Oracle. Could this be a difference between how the Oracle and Postgres RDBMSs handle self-referential foreign key constraints? I believe this is the only place in our schema with such a constraint. The current plan is to discuss this issue with TGI members more familiar with the workflow system, namely @davidlmorton

@ghost ghost assigned malachig Jan 16, 2014
@malachig
Copy link
Collaborator

To test possible resolutions to this problem we can log into clia1 and run a somatic-variation build as follows:

ssh clia1 #Log into clia box
sudo -i -u gmsuser #Switch to test user setup for running the GMS
genome model build view ffeb8660087945189028e52cd6320c06 #View an existing build that had these problems
genome model build start 2891407507 #Start a new build that will quickly error out in the same way

In the standalone GMS, relevant software is installed here:
/opt/gms/PEL8970/sw/genome/ #(we use the 'gms-pub' branch for standalone)
/opt/gms/PEL8970/sw/workflow/ #(we use the 'gms-pub' branch for standalone)
/opt/gms/PEL8970/sw/ur/ #(we use a 'gms-pub' tag for standalone)

Installation happens by running the Makefile here:
/home/gmsuser/gms/Makefile #See https://github.com/genome/gms for details

The schema is here:
/home/gmsuser/gms/setup/schema.psql

All tables are stored in a single postgres database (no Oracle). You can log into this as follows:
sudo -u postgres psql -d genome

To rebuild the database and create new builds totally from scratch you can do:
cd /home/gmsuser/gms
make db-rebuild
wget http://genome.wustl.edu/pub/software/gms/testdata/GMS1/export/18177dd5eca44514a47f367d9804e17a-2014.1.15.dat
setup/restore_original_tst1_data.pl
genome sys gateway attach GMS1 --protocol ftp --rsync
setup/use_sampled_tst1_data.pl
genome model import metadata 18177dd5eca44514a47f367d9804e17a-2014.1.15.dat
genome model build start "name='hcc1395-normal-snparray'"
genome model build start "name='hcc1395-tumor-snparray'"
genome model build start "name='hcc1395-normal-refalign-exome'"
genome model build start "name='hcc1395-tumor-refalign-exome'"
genome model build start "name='hcc1395-somatic-exome'"

@brummett
Copy link

Looks like it was a casing-problem with the schema/table names. The schema creation script creates all the tables with lower cased names, but the table_name class attribute for the Workflow classes is mixed like "workflow.TABLE_NAME". So, when it was looking for the foreign key constraints for the workflow.instance table, it wasn't finding any, and the inserts were done in the wrong order.

It looks like I've been able to fix it by changing the Workflow MetaDB info to always use lower case schema. table and column names, to match the schema build script, and edit the workflow class' table_name attribute to have everything lower cased (workflow.table_name).

Build 02a06c952908411bbad14cf2f9552769 is running now. Many parallel steps have completed successfully.

When it finishes (or dies from some unrelated problem), I'll push my fix for Workflow's MetaDB to the gms-pub branch of Workflow.

@brummett
Copy link

Build 02a06c952908411bbad14cf2f9552769 crashed in its final step "Annotate And Upload Variants" because the annotator couldn't find a file it was looking for: /gscmnt/ams1100/info/v37_ucsc_conservation/chr1-rec

I'd say the FK problem is fixed. I've pushed a commit to the gms-pub branch of the workflow repo

@obigriffith
Copy link
Collaborator

Awesome! That is an unrelated issue described here:
#72

@gatoravi gatoravi reopened this Mar 16, 2015
@gatoravi
Copy link
Contributor Author

I see a similar issue with Pindel in the somatic-variation workflow,

    451 2015-03-16 17:29:40+0000 ip-172-31-19-8: DBD::Pg::st execute failed: ERROR:  insert or update on table "instance" violates foreign key constraint "instance_peer_instance_id_fkey"
    452 2015-03-16 17:29:40+0000 ip-172-31-19-8: DETAIL:  Key (peer_instance_id)=(cb0287f34d074a78877863ccb749dedc) is not present in table "instance". at /opt/gms/RGTBV26/sw/ur/lib/UR/DBI.§
    453 2015-03-16 17:29:40+0000 ip-172-31-19-8: 2015/03/16 17:29:40 Workflow::DataSource::InstanceSchema id(Workflow::DataSource::InstanceSchema): Workflow::DataSource::InstanceSchema: Err§
    454 2015-03-16 17:29:40+0000 ip-172-31-19-8:  INSERT INTO workflow.instance (workflow_plan_id,workflow_instance_id,peer_instance_id,parent_instance_id,parent_execution_id,parallel_index§
    455 2015-03-16 17:29:40+0000 ip-172-31-19-8: PARAMS: 'e09177aaa6784d5884ea498e4d1a6558', 'fe9cf90e320d44048342597d075a7b75', 'cb0287f34d074a78877863ccb749dedc', (undef), (undef), '14', §
    456 2015-03-16 17:29:40+0000 ip-172-31-19-8:  024c6f37ca0f44d89ea714ec3a515c65^@^@^@^Rprevious_result_id^D^C^@^@^@^B^D^Q&Genome::Model::Build::SomaticVariation^C^@^@^@^P

@gatoravi
Copy link
Contributor Author

On second thought, this might need a new issue actually. Opening #181

@gatoravi
Copy link
Contributor Author

Note - The fix for this old issue is here, genome/tgi-workflow@28c525a

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

4 participants