Releases: MarquezProject/marquez

Marquez 0.50.0

24 Oct 08:23

Added

  • Web: New Data Observability dashboard for stats on OpenLineage events (past 24 hours, past 7 days); views are also available for sources, datasets, and jobs; a new job list view displays the latest N runs (and duration) for a given job #2913 @phixMe
  • Web: 404 page #2890 @phixMe
  • Web: Display parent job (if present) in job panel #2868 @phixMe
  • Web: Allow override of web.port via WEB_PORT environment variable #2838 @bidlako
  • Web: Allow nullable columns for schema in dataset panel (use N/A) #2896 @phixMe
  • Web: Better feedback when lineage events are loading #2916 @NisargChokshi45
  • API: Job object will now return Job.latestRuns (latest N runs) and Job.latestRun (last run to execute) #2901 @phixMe
  • API: Use io.openlineage.server.* pkg and class Metadata (utility class for OpenLineage.RunEvent) #2853 @wslulciuc
  • API: Use TIMESTAMPTZ for timestamps in database; supports Data Observability dashboard with timezone of user #2924 @wslulciuc
  • API: Set current_run_uuid in table jobs optimizing query for JobDao.findAll() #2929 @wslulciuc
  • API: New GET /api/v1/jobs #2930 @wslulciuc
  • CLI: New cmd args for cli.MetadataCommand #2923 @wslulciuc
    • --jobs: limits OL jobs up to N (default: 5)
    • --runs-per-job: limits OL run executions per job up to N (default: 10)
    • --runs-active: limits OL run executions marked as active (='RUNNING') up to N
    • --max-run-fails-per-job: maximum OL run fails per job (default: 2)
    • --min-run-duration: minimum OL run duration (in seconds) per execution (default: 300)
    • --run-start-time: specifies the OL run start time in UTC ISO ('YYYY-MM-DDTHH:MM:SSZ'); used for the initial OL run, with subsequent runs starting relative to the initial start time. (default: 2024-10-15T01:00:11.080828Z)
    • --run-end-time: specifies the OL run end time in UTC ISO ('YYYY-MM-DDTHH:MM:SSZ'); used for the initial OL run, with subsequent runs ending relative to the initial end time. (default: 2024-10-15T01:07:25.080828Z)
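A minimal sketch of how a client might read the Job.latestRun and Job.latestRuns fields listed above from a job object. The `name` and `state` keys and the overall JSON shape are assumptions made for illustration, not confirmed by these notes; only `latestRun` and `latestRuns` come from the release entry.

```python
from typing import Any, Optional


def summarize_job(job: dict[str, Any]) -> dict[str, Any]:
    """Summarize a job object that carries latestRun / latestRuns.

    Assumption: the job is a decoded JSON object; 'name' and 'state'
    are hypothetical keys used only for this illustration.
    """
    latest_run: Optional[dict] = job.get("latestRun")  # last run to execute
    latest_runs: list = job.get("latestRuns") or []    # latest N runs
    return {
        "name": job.get("name"),
        "latest_state": latest_run.get("state") if latest_run else None,
        "recent_run_count": len(latest_runs),
    }
```

Treating both fields as optional keeps the helper usable against servers older than 0.50.0, where the fields may be absent.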

Fixed

  • Web: Better rendering of long text #2942 @phixMe
  • Web: Display full runID and check icon when copied #2940 #2941 @wslulciuc @phixMe
  • Web: Use DatasetVersionAPI to display latest schema and remove extra job facets API call in dataset panel #2938 @phixMe
  • Web: Use DatasetAPI for data quality assertions in dataset panel #2937 @phixMe
  • Web: Fill-in job node in lineage graph with correct color for JobEvents #2934 @phixMe
  • Web: Fill-in job node in lineage graph with correct color for run states RUNNING, COMPLETED, etc #2897 @phixMe
  • API: Pagination for DatasetVersion.findAll(); not all dataset versions were returned for GET /api/v1/namespaces/{namespace}/datasets/{dataset}/versions #2944 @inanalper
  • API: null namespace and dataset name in view dataset_view for old versions; use table dataset_versions instead in column lineage query #2881 @sophiely
  • API: Missing DELETE CASCADE on table job_facets #2878 @mattwparas
  • API: Ensure Job.latestRun in Job object is set for runs in a RUNNING state; previously, Job.latestRun was set only for a run in a done state (COMPLETED / FAILED) #2933 @phixMe
  • CLI: Repurpose cmd db-migrate to run all pending database migrations, no longer coupling migrations with HTTP server startup #2936 @davidjgoss
  • Chart: Missing common labels for deployment.replicas #2877 @alaturqua
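With the pagination fix for dataset versions noted above, a client can walk every page of GET /api/v1/namespaces/{namespace}/datasets/{dataset}/versions. The sketch below assumes the endpoint accepts `limit` and `offset` query parameters; `versions_url`, `iter_all`, and the `fetch_page` callable are hypothetical names introduced here, and the HTTP call itself is left to the caller.

```python
from typing import Callable, Iterator
from urllib.parse import quote, urlencode


def versions_url(base_url: str, namespace: str, dataset: str,
                 limit: int, offset: int) -> str:
    """Build the dataset versions URL (limit/offset are assumed parameter names)."""
    path = (
        f"/api/v1/namespaces/{quote(namespace, safe='')}"
        f"/datasets/{quote(dataset, safe='')}/versions"
    )
    query = urlencode({"limit": limit, "offset": offset})
    return f"{base_url.rstrip('/')}{path}?{query}"


def iter_all(fetch_page: Callable[[int, int], list], limit: int = 100) -> Iterator:
    """Yield items page by page until a short (or empty) page is returned.

    fetch_page(limit, offset) is a hypothetical callable that performs the
    request and returns the list of versions for that page.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            break
        offset += limit
```

Passing the fetch step in as a callable keeps the loop independent of any particular HTTP library.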

Marquez 0.49.0

07 Aug 16:25

Added

  • API: Job-to-Job lineage #2752 @yanlibert
    Intended in part to spur a larger discussion of full parent/child hierarchy handling in Marquez. Changes only the backend API, adding the Job UUID along with the parent name to the Job metadata returned.

Fixed

  • Web: security updates #2864 @phixMe
    Resolves critical security issues found using NPM's audit command.
  • Web: encode Job name in API requests #2866 @dolfinus
    Urlencodes Job, Dataset, tag and field names while sending an API request.
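To illustrate why names must be URL-encoded before they are placed in a request path, here is a small Python sketch (the actual fix lives in the web UI code, which is not shown in these notes). The `job_path` helper and the `/jobs/{job}` path are assumptions, modeled on the dataset versions path that appears elsewhere in these notes.

```python
from urllib.parse import quote


def job_path(namespace: str, job: str) -> str:
    """Build a request path with each name encoded as a single path segment.

    safe='' makes quote() escape '/' as well, so a name containing slashes
    cannot be misread as extra path segments. The path layout is assumed.
    """
    return (
        f"/api/v1/namespaces/{quote(namespace, safe='')}"
        f"/jobs/{quote(job, safe='')}"
    )


print(job_path("s3://bucket", "etl/daily load"))
# prints: /api/v1/namespaces/s3%3A%2F%2Fbucket/jobs/etl%2Fdaily%20load
```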

Marquez 0.48.0

05 Aug 18:49

Added

  • API: add endpoint method and path to metrics name #2850 @JDarDagran
    Previously, the metrics endpoint gathered only the SQL Object name and method name. This change introduces labels (DAO name, DAO method, endpoint method, endpoint path) and adds more information about endpoints.
  • API: add paging to dataset versions panel #2855 @davidsharp7
    Adds Datasets paging.
  • API: add paging on Jobs panel #2852 @davidsharp7
    Adds Job-level paging of Runs.
  • API: add Dataset schema versions #2763 @davidjgoss
    Adds Dataset schema versions to the model and enables writing to it.
  • Docker: make db port configurable via POSTGRES_PORT #2751 @merobi-hub
    Adds support for easy db port reassignment.
  • Java: allow customization of Apache HTTP in Java client #2822 @davidjgoss
    Allows customization of Apache HTTP in Java client.
  • Web: add Job tagging to UI #2837 @davidsharp7
    Adds Job tagging to the UI.
  • Web: source code facets #2833 @phixMe
    Adds typedef and rendering of the sourceCode facet for a Job if available.

Fixed

  • API: Dataset query to get only the latest facet for each version #2859 @sophiely
    The facet partition is ranked by Dataset version and facet name so that only the most recent facet is taken for each Dataset UUID and type.
  • API: optimize column lineage query performance #2821 @vinhnemo
    Adds a filter condition to the CTE dataset_fields_view in ColumnLineageDao.java.
  • Web: deduplicate the versions displayed #2854 @namyyys
    Excludes the symlinks from the result of the query displaying the version history in order to exclude duplicate versions.
  • Web: clean up issues highlighted by some Spark Integration Data #2856 @phixMe
    Fixes numerous issues in our interfaces related to some OpenLineage Spark events.
  • Web: remove limit from assertion evaluation #2844 @phixMe
    Fixes bug where our status indicator was the wrong color.
  • Web: bring Dataset tags into line with Job Tags #2841 @davidsharp7
    Brings Dataset tags into line with Job tags.
  • Web: fix scroll issues for drawer and home pages #2820 @phixMe
    Scrolling improvements for drawer and home pages.
  • Web: fix search endpoint parameters #2818 @Nisarg-Chokshi
    The search API parameters were not getting updated correctly on changing the filter and sort options.
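The "latest facet for each version" fix above (#2859) amounts to keeping one row per (dataset version, facet name) pair, chosen by recency. The following is an in-memory sketch of that idea only, not the SQL used in Marquez; the row keys `dataset_version_uuid`, `name`, and `created_at` are hypothetical.

```python
from typing import Any


def latest_facets(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only the most recent facet per (dataset version, facet name).

    Assumption: each row is a dict with hypothetical keys
    'dataset_version_uuid', 'name', and a comparable 'created_at'.
    """
    latest: dict[tuple, dict[str, Any]] = {}
    for row in rows:
        key = (row["dataset_version_uuid"], row["name"])
        current = latest.get(key)
        if current is None or row["created_at"] > current["created_at"]:
            latest[key] = row
    return list(latest.values())
```

In SQL, the same effect is typically obtained with a window function that ranks rows within each partition and keeps the top-ranked row.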

Removed

  • Web: DRY paging #2832 @phixMe
    Removes repeated code for paging on lineage events, jobs and datasets.

Marquez 0.47.0

17 May 16:33

Added


Data Quality and Job Status Display in Marquez Web


  • API: add job tagging to API #2774 @davidsharp7
    Adds support for job tagging to the API.
  • Chart: add serviceAccount and extraContainers to helm chart values #2766 @kostas-theo
    To make the Kubernetes service account configurable, adds these values to the helm chart values with defaults set to maintain current functionality.
  • Client/Java: add jobVersion field to Run in Java client #2808 @davidjgoss
    Adds jobVersion field to Run in Java client.
  • Docker: improve down.sh script #2778 @dolfinus
    Adds new -v option and fixes down.sh script to rely on docker-compose down -v and make volume deletion optional.
  • Web: tooltips and display updates #2809 @phixMe
    Updates tooltips to be more modernized and custom.
  • Web: update JSON theme #2807 @phixMe
    Makes the JSON theme more in-line with the Marquez brand.
  • Web: column lineage linking and sticky tab titles #2805 @phixMe
    Adds sticky Titles and moves column lineage links to the table definition.
  • Web: refine panel feature set #2798 @phixMe
    Adds many refinements in response to user feedback.
  • Web: update dataset/dataset field-tagging experience #2761 @davidsharp7
    Adds support for adding multiple tags at once, introduces a switch to allow field-level tags to be exposed, and fixes refresh for an improved field-tagging experience.
  • Web: web refresh + loading states #2779 @phixMe
    Adds a refresh button for jobs, datasets, and lineage events pages. This also will work in empty states.

Removed

  • Web: remove old files and dependencies #2801 @phixMe
    Drops deps and removes unused React components no longer required by the new lineage graph.

Fixed

  • API: adapt column lineage query for symlink dataset #2775 @sophiely
    Changes the column lineage query in order to take only the 'main' dataset, not the dataset created via symlink.
  • Web: resolve issue where data quality assertion facets are not displayed #2528 @sophiely
    Fixes rendering of the DataQualityAssertion facet by adding support for dataset, unknown and input.
  • Web: fix showTags refresh #2799 @davidsharp7
    Adds showTags to the dependencies of fetchDatasetVersions and disables the show tags toggle until the latest version has been pulled.
  • Web: various dataset tags improvements #2813 @davidsharp7
    Various tag improvements including a caret for the dropdown.
  • Web: use Webpack-bundled icon instead of GitHub-hosted content #2803 @dodo0822
    For compliance with a strict CSP, replaces the icon with an SVG bundled by Webpack instead of linking to raw.githubusercontent.com.

Marquez 0.46.0

15 Mar 19:12

Changed

  • Web: various revisions #2770 @phixMe
    Includes clean up of issues in the UI and removal of non-useful elements.

Fixed

  • Streaming API: fix behaviour for COMPLETE/FAIL events within streaming jobs #2768 @pawel-big-lebowski
    A new job_version is no longer created for a streaming job terminal event with no dataset information; the existing version is kept.

Marquez 0.45.0

07 Mar 20:13

Added


Redesigned Web UI Featuring Column Lineage

  • Web: updates to Table and Column Lineage #2725 @phixMe
    A new page for column lineage and an updated view for lineage with a common set of shared principles.
  • Web: quality of life updates for new lineage graph display #2750 @phixMe
    Visual updates from early feedback on lineage graph navigation, including a zoom button to center on the selected node.
  • Web: improve visual display of lineage #2753 @phixMe
    Visual improvements to nodes including the addition of more detail and the ability to collapse dataset nodes manually.

  • Web: add dataset field level tags to UI #2729 @davidsharp7
    Updates to the DatasetTags component to allow for field-level tagging/deletion and addition of this to the DatasetInfo component.
  • Web: update dataset tags to allow editing/addition of tags #2759 @davidsharp7
    Updates to DatasetTags to include a split button menu and a new dialog/reducer for adding new tags.
  • Web: minor dataset tags revisions #2754 @phixMe
    Minor cleanup of the dataset tags feature including a pointer on the expandable row and a transition on row expansion, plus some new CSS elements.

Fixed

  • Web: minor UI enhancements #2727 @phixMe
    Hygienic cleanup of project as a follow-up to #2725, including a fix for #2747.
  • Web: fix symlink display #2736 @sophiely
    Changed behavior to display the symlink dataset in the previously empty namespace and link the symlink dataset lineage to the main dataset.

Marquez 0.45.0-rc.1

13 Feb 23:26

Added

  • Web: updates for Table and Column Lineage #2725 @phixMe
    Creates a new page for column lineage and an updated view for lineage with a common set of shared principles.
  • Web: add dataset field level tags to UI #2729 @davidsharp7
    Updates the DatasetTags component to allow for field-level tagging/deletion and adds this to the DatasetInfo component.

Fixed

  • Web: minor UI enhancements #2727 @phixMe
    Hygienic cleanup of project as a follow-up to #2725, including a fix for #2747.
  • API: fill data in column lineage input nodes #2742 @JDarDagran @wslulciuc
    Fixes the issue of null output nodes in the column lineage endpoint.

Marquez 0.44.0

25 Jan 20:00

Added

  • Web: add dataset tags tabs for adding/deleting of tags #2714 @davidsharp7
    Adds a dataset tags component so that datasets can have tags added/deleted.
  • API: Add endpoint to delete field-level tags #2705 @davidsharp7
    Adds delete endpoint to remove dataset field tags.

Fixed

  • Web: fix dataset tag reducers bug #2716 @davidsharp7
    Removes result from dataset tags reducer to fix a sidebar bug.

Marquez 0.43.1

20 Dec 20:19

Fixed

  • API: fix broken lineage graph for multiple runs of the same job #2710 @pawel-big-lebowski
    Problem: the lineage graph was not available for jobs that had been run multiple times, as a result of a bug introduced in a recent release. To fix the inconsistent data, this UPDATE query should be run. This is not required when upgrading directly to 0.43.0.

Marquez 0.43.0

15 Dec 19:46

Added

  • API: refactor the RunDao SQL query #2685 @sophiely
    Improves the performance of the SQL query used for listing all runs.
  • API: refactor dataset version query #2683 @sophiely
    Improves the performance of the SQL query used for the dataset version.
  • API: add support for a DatasetEvent #2641 #2654 @pawel-big-lebowski
    Adds a feature for saving into the Marquez model datasets sent via the DatasetEvent event type. Includes optimization of the lineage query.
  • API: add support for a JobEvent #2661 @pawel-big-lebowski
    Adds a feature for saving into the Marquez model jobs and datasets sent via the JobEvent event type.
  • API: add support for streaming jobs #2682 @pawel-big-lebowski
    Creates job version and reference rows at the beginning of the job instead of on complete. Updates the job version within the run if anything changes.
  • API/spec: implement upstream run-level lineage #2658 @julienledem
    Returns the version of each job and dataset a run is depending on.
  • API: add DELETE endpoint for dataset tags #2698 @davidsharp7
    Creates a new endpoint for removing the linkage between a dataset and a tag in datasets_tag_mapping to supply a way to delete a tag from a dataset via the API.
  • Web: add a dataset drawer #2672 @davidsharp7
    Adds a drawer to the dataset column view in the GUI.

Fixed

  • Client/Java: change url path encoding to match jersey decoding #2693 @davidjgoss
    Swaps out the implementation of MarquezPathV1::encode to use the UrlEscapers path segment escaper, which does proper URI encoding.
  • Web: fix pagination in the Jobs route #2655 @merobi-hub
    Hides job pagination in the case of no jobs.
  • Web: fix empty search experience #2679 @phixMe
    Use of the previous search value was resulting in a bad request for the first character of a search.

Removed

  • Client/Java: remove maven-archiver dependency from the Java client #2695 @davidjgoss
    Removes a dependency from build.gradle that was bringing some transitive vulnerabilities.