Merge pull request #3395 from ajdapretnar/transform-docs
Transform docs
BlazZupan authored Nov 23, 2018
2 parents 66e52c5 + da60a35 commit 3b31631
Showing 4 changed files with 24 additions and 3 deletions.
21 changes: 21 additions & 0 deletions doc/visual-programming/source/widgets/data/transform.rst
@@ -12,3 +12,24 @@ Inputs
Outputs
Transformed Data
transformed dataset


**Transform** maps new data into a transformed space. For example, if we transform some data with PCA and wish to observe new data in the same space, we can use transform to map the new data into the PCA space created from the original data.

.. figure:: images/Transform.png

The widget accepts two inputs: the new data and the preprocessor that was used to transform the original data.
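
The idea of reusing a fitted preprocessor can be sketched in plain Python. This is only an illustration of the concept, not Orange's implementation: the helper names ``fit_pca_1d`` and ``apply_pca_1d`` and the toy numbers are made up for this example, and only the first principal component is computed (via power iteration) to keep the sketch short.

```python
import math

def fit_pca_1d(rows, iterations=200):
    """Learn a centering vector and the first principal axis from `rows`.

    Hypothetical helper for illustration; `rows` is a list of equal-length lists.
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration; the uneven start vector avoids the simplest
    # case of starting orthogonal to the leading axis.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def apply_pca_1d(model, rows):
    """Project `rows` using the mean and axis learned earlier (no refitting)."""
    mean, axis = model
    return [sum((r[j] - mean[j]) * axis[j] for j in range(len(axis)))
            for r in rows]

# Made-up "old" and "new" data, standing in for the two widget inputs.
old_data = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
new_data = [[4.0, 4.0]]

model = fit_pca_1d(old_data)                   # plays the role of the preprocessor
old_projected = apply_pca_1d(model, old_data)
new_projected = apply_pca_1d(model, new_data)  # new data mapped into the old space
```

The point of the sketch is that ``new_data`` is centered and projected with the mean and axis learned from ``old_data``; nothing is refitted on the new rows, which is what keeps both sets in the same space.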

Example
-------

We will use iris data from the **File** widget for this example. To create two separate data sets, we will use **Select Rows** and set the condition to *iris is one of iris-setosa, iris-versicolor*. This will output a data set with 100 rows, half of them belonging to the iris-setosa class and the other half to iris-versicolor.

We will transform the data with **PCA** and select the first two components, which explain 96% of variance. Now, we would like to apply the same preprocessing to the 'new' data, that is, the remaining 50 iris-virginica instances. Send the unused data from **Select Rows** to **Transform**. Make sure to use the *Unmatched Data* output from the **Select Rows** widget. Then add the *Preprocessor* output from **PCA**.

**Transform** will apply the preprocessor to the new data and output the result. To add the new data to the old data, use **Concatenate**. Use the *Transformed Data* output from **PCA** as *Primary Data* and the *Transformed Data* output from **Transform** as *Additional Data*.

Observe the results in a **Data Table** or in a **Scatter Plot** to see the new data in relation to the old data.

.. figure:: images/Transform-Example.png

6 changes: 3 additions & 3 deletions doc/visual-programming/source/widgets/unsupervised/tsne.rst
@@ -43,11 +43,11 @@ The **t-SNE** widget plots the data with a t-distributed stochastic neighbor emb
Example
-------

We will use :doc:`Single Cell Datasets<./singlecelldatasets>` widget to load *Bone marrow mononuclear cells with AML (sample)* data. Then we will pass it through **k-Means** and select 2 clusters from Silhouette Scores. Ok, it looks like there might be two distinct clusters here.
We will use **Single Cell Datasets** widget to load *Bone marrow mononuclear cells with AML (sample)* data. Then we will pass it through **k-Means** and select 2 clusters from Silhouette Scores. Ok, it looks like there might be two distinct clusters here.

But can we find subpopulations in these cells? Let us load *Bone marrow mononuclear cells with AML (markers)* with :doc:`Single Cell Datasets<./singlecelldatasets>`. Now, pass the marker genes to **Data Table** and select, for example, natural killer cells from the list (NKG7).
But can we find subpopulations in these cells? Let us load *Bone marrow mononuclear cells with AML (markers)* with **Single Cell Datasets**. Now, pass the marker genes to **Data Table** and select, for example, natural killer cells from the list (NKG7).

Pass the markers and k-Means results to :doc:`Score Cells<./scorecells>` widget and select *geneName* to match markers with genes. Finally, add **t-SNE** to visualize the results.
Pass the markers and k-Means results to **Score Cells** widget and select *geneName* to match markers with genes. Finally, add **t-SNE** to visualize the results.

In **t-SNE**, use *Scores* attribute to color the points and set their size. We see that killer cells are nicely clustered together and that t-SNE indeed found subpopulations.
