
ClusterId and name are missing #219

Open
akbarali789 opened this issue Jul 18, 2023 · 3 comments

Comments

@akbarali789

Hi Team,

I took the code from the spark-monitoring branch l4jv2 to support our custom logging after our ADB version upgrade to 12.2 LTS.

Issue: Apart from the metrics starting with "app" in the SparkMetric_CL table, I could not find the cluster ID and name details.


Issue: Logs are not loading into the SparkListenerEvent_CL table with the latest solution.

Please help me resolve these issues, as our reports are no longer working. Thanks in advance.

@infosuresh2k

The spark-monitoring branch l4jv2, which supports custom logging after the DBR upgrade to 12.2, is missing a lot of columns (cluster ID, name, application ID, etc.), and the mappings are wrong as well. I am not sure why the new columns below were introduced in the new version when they were supposed to map to the existing columns. A workaround sketch follows the list.

10.4 vs 11.0-and-above column mappings:
- Level vs log_level_s
- thread_name_s vs process_thread_name_s
- logger_name_s vs log_logger_s
- applicationName_s vs sparkAppName_s
- nodeType_s vs sparkNode_s
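In the meantime, here is a minimal KQL sketch that normalizes both schemas, assuming SparkLoggingEvent_CL as the target table and the column pairs listed above (untested; column_ifexists returns the default when a column is absent, and coalesce picks the first non-empty string):

// Normalize 10.4 and 11.0+ column names so existing reports can query one schema.
SparkLoggingEvent_CL
| extend
    level      = coalesce(column_ifexists("Level", ""), column_ifexists("log_level_s", "")),
    threadName = coalesce(column_ifexists("thread_name_s", ""), column_ifexists("process_thread_name_s", "")),
    loggerName = coalesce(column_ifexists("logger_name_s", ""), column_ifexists("log_logger_s", "")),
    appName    = coalesce(column_ifexists("applicationName_s", ""), column_ifexists("sparkAppName_s", "")),
    nodeType   = coalesce(column_ifexists("nodeType_s", ""), column_ifexists("sparkNode_s", ""))
| project TimeGenerated, level, threadName, loggerName, appName, nodeType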

@vandanakravi-tfm

vandanakravi-tfm commented Aug 23, 2023

@infosuresh2k @akbarali789 I'm also facing the same issue with SparkLoggingEvent_CL. Have you found a solution for this, or any workarounds?

@gustavomcarmo

I've read somewhere that we can define the columns/values sent to the Log Analytics workspace by changing the content of sparkLayout.json. The columns can be based on the Spark properties listed in the Environment tab of the Spark UI in the Databricks cluster UI, as shown below:

[screenshot: Spark properties in the Environment tab of the Spark UI]

I've then tried to merge in as many as possible of the columns already used by the implementation covering Spark versions older than 3.3.x, ending up with this:

cat << 'EOF' > "$STAGE_DIR/sparkLayout.json"
{
  "@timestamp": {
    "$resolver": "timestamp",
    "pattern": {
      "format": "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'",
      "timeZone": "UTC"
    }
  },
  "level": {
    "$resolver": "level",
    "field": "name"
  },
  "message": {
    "$resolver": "message",
    "stringified": true
  },
  "thread.name": {
    "$resolver": "thread",
    "field": "name"
  },
  "logger.name": {
    "$resolver": "logger",
    "field": "name"
  },
  "labels": {
    "$resolver": "mdc",
    "flatten": true,
    "stringified": true
  },
  "tags": {
    "$resolver": "ndc"
  },
  "error.type": {
    "$resolver": "exception",
    "field": "className"
  },
  "error.message": {
    "$resolver": "exception",
    "field": "message"
  },
  "error.stack_trace": {
    "$resolver": "exception",
    "field": "stackTrace",
    "stackTrace": {
      "stringified": true
    }
  },
  "applicationId": "${spark:spark.app.id:-}",
  "applicationName": "${spark:spark.app.name:-}",
  "nodeType": "${spark:nodeType}",
  "clusterId": "${spark:spark.databricks.clusterUsageTags.clusterId:-}",
  "clusterName": "${spark:spark.databricks.clusterUsageTags.clusterName:-}"
}
EOF

The columns clusterId and clusterName are there.
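
To verify they are arriving, a quick Log Analytics check can help (a sketch only: it assumes the ingestion pipeline appends the usual _s suffix for custom string fields, giving clusterId_s and clusterName_s):

// Count recent log events per cluster; clusterId_s and clusterName_s are assumed
// names derived from the layout above plus the _s suffix for custom string fields.
SparkLoggingEvent_CL
| where TimeGenerated > ago(1h)
| summarize events = count()
    by clusterId = tostring(column_ifexists("clusterId_s", "")),
       clusterName = tostring(column_ifexists("clusterName_s", ""))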
