[MINOR][DOCS] Fix some typos in docs
### What changes were proposed in this pull request?
This PR aims to fix some typos in the docs, including `docs/sql-ref-syntax-qry-star.md`, `docs/running-on-kubernetes.md`, and `connector/profiler/README.md`.

### Why are the changes needed?
https://spark.apache.org/docs/4.0.0-preview1/sql-ref-syntax-qry-star.html
Some of the `sql examples` in `docs/sql-ref-syntax-qry-star.md` use the `Unicode Character 'SINGLE QUOTATION MARK'` (’) instead of the ASCII apostrophe, so end users cannot execute the statements successfully after copy-pasting them, e.g.:
<img width="660" alt="image" src="https://github.com/apache/spark/assets/15246973/055aa0a8-602e-4ea7-a065-c8e0353c6fb3">

### Does this PR introduce _any_ user-facing change?
Yes, end users will see more user-friendly docs.

### How was this patch tested?
Manually tested.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#47261 from panbingkun/fix_typo_docs.

Authored-by: panbingkun <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
panbingkun authored and HyukjinKwon committed Jul 9, 2024
1 parent 18f2450 commit 6edfd66
Showing 3 changed files with 11 additions and 11 deletions.
4 changes: 2 additions & 2 deletions connector/profiler/README.md
@@ -9,9 +9,9 @@ To build

## Executor Code Profiling

-The spark profiler module enables code profiling of executors in cluster mode based on the the [async profiler](https://github.com/async-profiler/async-profiler/blob/v2.10/README.md), a low overhead sampling profiler. This allows a Spark application to capture CPU and memory profiles for application running on a cluster which can later be analyzed for performance issues. The profiler captures [Java Flight Recorder (jfr)](https://access.redhat.com/documentation/es-es/red_hat_build_of_openjdk/17/html/using_jdk_flight_recorder_with_red_hat_build_of_openjdk/openjdk-flight-recorded-overview) files for each executor; these can be read by many tools including Java Mission Control and Intellij.
+The spark profiler module enables code profiling of executors in cluster mode based on the [async profiler](https://github.com/async-profiler/async-profiler/blob/v3.0/README.md), a low overhead sampling profiler. This allows a Spark application to capture CPU and memory profiles for application running on a cluster which can later be analyzed for performance issues. The profiler captures [Java Flight Recorder (jfr)](https://access.redhat.com/documentation/es-es/red_hat_build_of_openjdk/17/html/using_jdk_flight_recorder_with_red_hat_build_of_openjdk/openjdk-flight-recorded-overview) files for each executor; these can be read by many tools including Java Mission Control and Intellij.

-The profiler writes the jfr files to the executor's working directory in the executor's local file system and the files can grow to be large so it is advisable that the executor machines have adequate storage. The profiler can be configured to copy the jfr files to a hdfs location before the executor shuts down.
+The profiler writes the jfr files to the executor's working directory in the executor's local file system and the files can grow to be large, so it is advisable that the executor machines have adequate storage. The profiler can be configured to copy the jfr files to a hdfs location before the executor shuts down.

Code profiling is currently only supported for

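For context on the profiler module touched above, here is a minimal usage sketch. The configuration keys (`spark.executor.profiling.enabled`, `spark.executor.profiling.dfsDir`) and paths are assumptions drawn from the connector/profiler README and may differ between Spark versions; check that README for the authoritative options.

```bash
# Minimal sketch (assumed config keys -- verify against connector/profiler/README.md):
# enable executor code profiling in cluster mode and copy each executor's jfr file
# to an HDFS directory before the executor shuts down.
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
  --deploy-mode cluster \
  --conf spark.executor.profiling.enabled=true \
  --conf spark.executor.profiling.dfsDir=hdfs:///spark/profiles \
  local:///opt/spark/examples/jars/spark-examples.jar 1000
```

The resulting jfr files can then be opened in tools such as Java Mission Control or IntelliJ, as the README notes.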
2 changes: 1 addition & 1 deletion docs/running-on-kubernetes.md
@@ -1875,7 +1875,7 @@ Spark allows users to specify a custom Kubernetes schedulers.
```

##### Build
-To create a Spark distribution along with Volcano suppport like those distributed by the Spark [Downloads page](https://spark.apache.org/downloads.html), also see more in ["Building Spark"](https://spark.apache.org/docs/latest/building-spark.html):
+To create a Spark distribution along with Volcano support like those distributed by the Spark [Downloads page](https://spark.apache.org/downloads.html), also see more in ["Building Spark"](https://spark.apache.org/docs/latest/building-spark.html):

```bash
./dev/make-distribution.sh --name custom-spark --pip --r --tgz -Psparkr -Phive -Phive-thriftserver -Pkubernetes -Pvolcano
16 changes: 8 additions & 8 deletions docs/sql-ref-syntax-qry-star.md
@@ -59,31 +59,31 @@ except_clause

```sql
-- Return all columns in the FROM clause
-SELECT * FROM VALUES(1, 2) AS TA(c1, c2), VALUES(‘a’, b’) AS TB(ca, cb);
+SELECT * FROM VALUES(1, 2) AS TA(c1, c2), VALUES('a', 'b') AS TB(ca, cb);
1 2 a b

-- Return all columns from TA
-SELECT TA.* FROM VALUES(1, 2) AS TA(c1, c2), VALUES(‘a’, b’) AS TB(ca, cb);
+SELECT TA.* FROM VALUES(1, 2) AS TA(c1, c2), VALUES('a', 'b') AS TB(ca, cb);
1 2

-- Return all columns except TA.c1 and TB.cb
-SELECT * EXCEPT (c1, cb) FROM VALUES(1, 2) AS TA(c1, c2), VALUES(‘a’, b’) AS TB(ca, cb);
+SELECT * EXCEPT (c1, cb) FROM VALUES(1, 2) AS TA(c1, c2), VALUES('a', 'b') AS TB(ca, cb);
2 a

-- Return all columns, but strip the field x from the struct.
-SELECT TA.* EXCEPT (c1.x) FROM VALUES(named_struct(‘x’, x, ‘y’, ‘y’), 2) AS (c1, c2), VALUES(‘a’, b’) AS TB(ca, cb);
+SELECT TA.* EXCEPT (c1.x) FROM VALUES(named_struct('x', x, 'y', 'y'), 2) AS (c1, c2), VALUES('a', 'b') AS TB(ca, cb);
{ y } 2 a b

-- Return the first not-NULL column in TA
-SELECT coalesce(TA.*) FROM VALUES(1, 2) AS TA(c1, c2), VALUES(‘a’, b’) AS TB(ca, cb);
+SELECT coalesce(TA.*) FROM VALUES(1, 2) AS TA(c1, c2), VALUES('a', 'b') AS TB(ca, cb);
1

--- Return 1 if any column in TB contains a ‘c’.
-SELECT CASE WHEN ‘c’ IN (TB.*) THEN 1 END FROM VALUES(1, 2) AS TA(c1, c2), VALUES(‘a’, b’) AS TB(ca, cb);
+-- Return 1 if any column in TB contains a 'c'.
+SELECT CASE WHEN 'c' IN (TB.*) THEN 1 END FROM VALUES(1, 2) AS TA(c1, c2), VALUES('a', 'b') AS TB(ca, cb);
NULL

-- Return all column as a single struct
-SELECT (*) FROM VALUES(1, 2) AS TA(c1, c2), VALUES(‘a’, b’) AS TB(ca, cb);
+SELECT (*) FROM VALUES(1, 2) AS TA(c1, c2), VALUES('a', 'b') AS TB(ca, cb);
{ c1: 1, c2: 2, ca: a, cb: b }

-- Flatten a struct into individual columns