3482 update output format #5

Open
wants to merge 6 commits into
base: main

Conversation

remo87

@remo87 remo87 commented Nov 4, 2024

  • Add jq because polars can't handle the size of the file; use subprocess to execute jq
  • Refactor homologue transformer to accept binary input and generate temporary file for processing
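For reviewers less familiar with jq, here is a rough pure-Python equivalent of the filter used in this PR, `.genes | {"genes": map({id: .id, name: .name})}`. It is only an illustration of what the filter keeps and drops: the function name `project_genes`, the sample record, and its `biotype` field are hypothetical, and the PR itself runs jq rather than this code.

```python
import json


def project_genes(document: dict) -> dict:
    """Keep only `id` and `name` for every entry under the top-level `genes` key.

    Hypothetical helper mirroring the jq filter
    .genes | {"genes": map({id: .id, name: .name})}
    """
    # dict.get returns None for a missing key, much like jq yields null
    return {
        "genes": [
            {"id": gene.get("id"), "name": gene.get("name")}
            for gene in document["genes"]
        ]
    }


# Made-up sample input; `biotype` stands in for any field that gets dropped
sample = {"genes": [{"id": "g1", "name": "A", "biotype": "x"}]}
print(json.dumps(project_genes(sample)))
```

Unlike this sketch, jq does the projection outside the Python process, which is the reason given above for using it on a file that polars could not handle.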

@remo87 remo87 requested a review from javfg November 4, 2024 10:54
def transform(src: BinaryIO, dst: BinaryIO) -> None:
    jq_command = '.genes | {"genes": map({id: .id, name: .name})}'

    # Generate a temporary file path tp store the source file
Suggested change
-     # Generate a temporary file path tp store the source file
+     # Generate a temporary file path to store the source file

Comment on lines 14 to 23
    temp_path = Path(f'/tmp/{uuid4()}.json')

    temp_path.write_bytes(src.read())

    # Ejecuta jq con subprocess
    result = subprocess.run(
        ['jq', jq_command, temp_path],
        capture_output=True,
        text=True
    )

@javfg javfg Nov 4, 2024

Suggested change
-     temp_path = Path(f'/tmp/{uuid4()}.json')
-     temp_path.write_bytes(src.read())
-     # Ejecuta jq con subprocess
-     result = subprocess.run(
-         ['jq', jq_command, temp_path],
-         capture_output=True,
-         text=True
-     )
+     # Store the file and run jq on it
+     with tempfile.NamedTemporaryFile() as temp_file:
+         temp_file.write(src.read())
+         result = subprocess.run(
+             ['jq', jq_command, temp_file.name],
+             capture_output=True,
+             text=True,
+         )

It's better to use tempfile when creating temporary files for security reasons.
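A minimal sketch of the tempfile approach, for illustration only. It makes two additions that are assumptions on my part rather than part of the suggestion: an explicit `flush()` before the child process runs (otherwise buffered bytes may not be on disk yet when the file is reopened by name) and `check=True` so a non-zero exit raises. The helper name `run_on_temp_file` is hypothetical, and reopening a `NamedTemporaryFile` by name while it is still open is assumed to run on a POSIX system.

```python
import subprocess
import sys
import tempfile


def run_on_temp_file(data: bytes, command: list[str]) -> str:
    """Write `data` to a temporary file and run `command` with its path appended.

    Hypothetical helper; the file is removed automatically when the
    `with` block exits.
    """
    with tempfile.NamedTemporaryFile(suffix=".json") as temp_file:
        temp_file.write(data)
        # Flush so the child process sees the full content when it opens the path
        temp_file.flush()
        result = subprocess.run(
            [*command, temp_file.name],
            capture_output=True,
            text=True,
            check=True,  # raise CalledProcessError on a non-zero exit status
        )
    return result.stdout


# Demo with the Python interpreter standing in for jq, so it runs without jq installed
echo_file = [sys.executable, "-c", "import sys; print(open(sys.argv[1]).read())"]
print(run_on_temp_file(b'{"genes": []}', echo_file))
```

In the transformer this would be called as something like `run_on_temp_file(src.read(), ['jq', jq_command])`. Note that `src.read()` still loads the whole input into memory before it is written out.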

)

# extract genes list
genes_list = inputGenes.explode('genes').unnest('genes')
# Verifica si el comando se ejecutó correctamente

@javfg javfg Nov 4, 2024

Same here!


# read id and name
output = genes_list.select(["id","name"])
# # read id and name
Suggested change
-     # # read id and name
+     # read id and name


# read id and name
output = genes_list.select(["id","name"])
# Ejecuta jq con subprocess
We should keep comments in English.

2 participants