# Apache Spark Installation 🚀

Welcome to the Apache Spark installation and usage guide! This guide will walk you through the steps to install and use Apache Spark on your local machine. Let's get started! 🌟

## Table of Contents
1. Prerequisites
2. Downloading Apache Spark
3. Installing Apache Spark
4. Setting Up Environment Variables
5. Running Spark Applications
6. Using PySpark
7. Conclusion

## Prerequisites 📋

Before installing Apache Spark, ensure you have the following software installed on your machine:

- Java Development Kit (JDK) 8 or later ☕
- Scala (optional, but recommended for Scala applications) 💻
- Python (if you plan to use PySpark) 🐍

You can check if Java is installed by running the following command:

```bash
java -version
```
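
If you plan to use Scala or PySpark, you can check those toolchains the same way (assuming `scala` and `python3` are on your PATH):

```bash
scala -version     # prints the installed Scala version
python3 --version  # prints the installed Python version
```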

## Downloading Apache Spark 📥

1. Visit the [Apache Spark download page](https://spark.apache.org/downloads.html).
2. Choose the latest version of Spark.
3. Select a pre-built package for Hadoop. If you're unsure, choose the default "Pre-built for Apache Hadoop" option (the exact label varies by release, e.g. "Pre-built for Apache Hadoop 2.7 and later").
4. Click on the "Download Spark" link.

## Installing Apache Spark 💾

1. Extract the downloaded Spark tarball (adjust the filename to match the package you downloaded):

```bash
tar -xvf spark-<version>-bin-hadoop2.7.tgz
```

2. Move the extracted directory to `/opt` (optional):

```bash
sudo mv spark-<version>-bin-hadoop2.7 /opt/spark
```

## Setting Up Environment Variables ⚙️

1. Open your `.bashrc` or `.zshrc` file:

```bash
nano ~/.bashrc
# or
nano ~/.zshrc
```

2. Add the following lines to set up the Spark environment variables:

```bash
export SPARK_HOME=/opt/spark
export PATH=$SPARK_HOME/bin:$PATH
```

3. Source the updated profile:

```bash
source ~/.bashrc
# or
source ~/.zshrc
```
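
To confirm the variables are set correctly, you can run a quick check (assuming Spark lives at `/opt/spark` as above):

```bash
echo $SPARK_HOME        # should print /opt/spark
spark-submit --version  # should print the installed Spark version
```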

## Running Spark Applications 🏃

To verify that Spark is installed correctly, you can run the Spark shell:

```bash
spark-shell
```

You should see the Spark shell prompt, indicating Spark is ready to use.

## Using PySpark 🐍

If you plan to use Spark with Python, you can use PySpark. Here's how to start the PySpark shell:

```bash
pyspark
```

This will open an interactive PySpark shell where you can run Spark commands using Python.
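
Inside the PySpark shell, a `SparkSession` is already available as `spark`, so a minimal sanity check might look like this:

```python
# Run inside the PySpark shell, where `spark` is predefined
df = spark.range(5)  # DataFrame with a single `id` column holding 0 through 4
df.show()            # prints the five rows to the console
```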

### Example PySpark Application 📘

Create a new Python file, `example.py`, with the following content:

```python
from pyspark.sql import SparkSession

# Initialize SparkSession
spark = SparkSession.builder.appName("example").getOrCreate()

# Create a DataFrame
data = [("Alice", 34), ("Bob", 45), ("Cathy", 29)]
columns = ["Name", "Age"]
df = spark.createDataFrame(data, columns)

# Show the DataFrame
df.show()

# Stop the SparkSession
spark.stop()
```

Run the script using the following command:

```bash
spark-submit example.py
```

You should see the DataFrame output in your terminal.
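
For the sample data above, the output should look roughly like this:

```
+-----+---+
| Name|Age|
+-----+---+
|Alice| 34|
|  Bob| 45|
|Cathy| 29|
+-----+---+
```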

## Conclusion 🎉

Congratulations! You've successfully installed and run Apache Spark on your machine. You are now ready to start building powerful big data applications. For more information and advanced usage, refer to the [official Apache Spark documentation](https://spark.apache.org/docs/latest/).

Happy Spark-ing! ✨

---

This guide should help you get started with Apache Spark quickly and easily.

# 🚀 Setting Up a CI/CD Pipeline Using GitLab

## Table of Contents
1. 🔧 Prerequisites
2. 📂 Project Setup
3. 🛠️ Configure GitLab Runner
4. 📄 Create `.gitlab-ci.yml` File
5. 🔄 Continuous Integration (CI) Configuration
6. 🚢 Continuous Deployment (CD) Configuration
7. ✅ Running the Pipeline

---

## 🔧 Prerequisites

Before setting up the CI/CD pipeline, ensure you have the following:
- A GitLab account 🧑‍💻
- A GitLab project repository 📁
- GitLab Runner installed (optional for local testing) 🏃‍♂️

## 📂 Project Setup

1. **Create a New Repository**:
   - Go to your GitLab account.
   - Click on **New Project** and follow the instructions to create a new repository.

2. **Clone the Repository**:
   ```sh
   git clone https://gitlab.com/username/repository-name.git
   cd repository-name
   ```

## 🛠️ Configure GitLab Runner

1. **Register GitLab Runner**:
   - Install GitLab Runner by following the [official documentation](https://docs.gitlab.com/runner/install/).
   - Register the runner using:
     ```sh
     sudo gitlab-runner register
     ```
   - Follow the prompts to configure the runner, as in the sketch below.
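
If you prefer to script this step, `gitlab-runner register` also supports a non-interactive mode. A minimal sketch, assuming a shell executor and a project registration token (newer GitLab versions instead use `--token` with a runner authentication token):

```sh
sudo gitlab-runner register \
  --non-interactive \
  --url "https://gitlab.com/" \
  --registration-token "<your-registration-token>" \
  --executor "shell" \
  --description "my-local-runner"
```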

## 📄 Create `.gitlab-ci.yml` File

1. **Create the File**:
   - In the root of your repository, create a file named `.gitlab-ci.yml`.
     ```sh
     touch .gitlab-ci.yml
     ```

2. **Basic Structure**:
   ```yaml
   stages:
     - build
     - test
     - deploy
   ```
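
Optionally, you can also declare a default Docker image for every job; a small sketch, assuming your runner uses the Docker executor (`alpine:latest` is just a placeholder):

```yaml
default:
  image: alpine:latest
```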

## 🔄 Continuous Integration (CI) Configuration

1. **Build Stage**:
   ```yaml
   build:
     stage: build
     script:
       - echo "Compiling the code..."
       # Add your build commands here
   ```

2. **Test Stage**:
   ```yaml
   test:
     stage: test
     script:
       - echo "Running tests..."
       # Add your test commands here
   ```

## 🚢 Continuous Deployment (CD) Configuration

1. **Deploy Stage**:
   ```yaml
   deploy:
     stage: deploy
     script:
       - echo "Deploying the application..."
       # Add your deployment commands here
     environment:
       name: production
       url: http://your-app-url.com
   ```
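
Putting the pieces together, a complete minimal `.gitlab-ci.yml` might look like this (the `echo` lines are placeholders for your real build, test, and deploy commands):

```yaml
stages:
  - build
  - test
  - deploy

build:
  stage: build
  script:
    - echo "Compiling the code..."

test:
  stage: test
  script:
    - echo "Running tests..."

deploy:
  stage: deploy
  script:
    - echo "Deploying the application..."
  environment:
    name: production
    url: http://your-app-url.com
```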

## ✅ Running the Pipeline

1. **Push Changes**:
   - Add, commit, and push your changes to the repository.
     ```sh
     git add .gitlab-ci.yml
     git commit -m "Add CI/CD pipeline configuration"
     git push origin main
     ```

2. **Pipeline Execution**:
   - Navigate to your GitLab project.
   - Go to **CI/CD > Pipelines** to view the running pipeline.

3. **Review Pipeline Status**:
   - Check the status of each stage and job.
   - Fix any issues if necessary and re-run the pipeline.

---

Congratulations! 🎉 You have successfully set up a CI/CD pipeline using GitLab. Your project is now configured to automatically build, test, and deploy with each commit.

---

Feel free to further customize your pipeline to suit your project's specific needs. Happy coding! 💻🚀