ParaRC can run in standalone mode, in which we test parallel repair without integration with a distributed storage system, and in HDFS3 integration mode, in which we test parallel repair with Hadoop-3.3.4. We first introduce the standalone mode, and then the HDFS3 integration mode.
To run ParaRC in the standalone mode, we need to finish the following steps:
- Prepare a cluster in Alibaba Cloud
- Compile ParaRC
- Prepare configuration files for each machine
- Generate the MLP offline
- Generate a stripe of blocks
- Start ParaRC
- Test parallel repair
To run ParaRC in the HDFS3 integration mode, we need to finish the following steps:
- Prepare a cluster in Alibaba Cloud *
- Compile ParaRC
- Deploy Hadoop-3.3.4 with OpenEC *
- Prepare configuration files for each machine *
- Generate the MLP offline
- Generate a stripe of blocks in Hadoop-3.3.4 *
- Start ParaRC
- Test parallel repair
Note that the steps marked with * in the HDFS3 integration mode are different from those in the standalone mode.
Machine | Number | Alibaba Machine Type | IP |
---|---|---|---|
PRS Generator | 1 | ecs.r7.2xlarge | 192.168.0.1 |
Controller | 1 | ecs.r7.xlarge | 192.168.0.2 |
Agent | 15 | ecs.r7.xlarge | 192.168.0.3; 192.168.0.4; 192.168.0.5; 192.168.0.6; 192.168.0.7; 192.168.0.8; 192.168.0.9; 192.168.0.10; 192.168.0.11; 192.168.0.12; 192.168.0.13; 192.168.0.14; 192.168.0.15; 192.168.0.16; 192.168.0.17 |
Among the 15 agents, the last one (192.168.0.17) serves as the new node that replaces a failed node.
On each machine, we create a default user named pararc.
We need to install the following third-party libraries to compile and run ParaRC.
- isa-l library
$> git clone https://github.com/01org/isa-l.git
$> cd isa-l/
$> ./autogen.sh
$> ./configure
$> make
$> sudo make install
- cmake
$> sudo apt-get install cmake
Please download the source code to /home/pararc/ParaRC. Then compile the source code:
$> cd /home/pararc/ParaRC
$> ./compile.sh
Each machine requires a configuration file /home/pararc/ParaRC/conf/sysSettings.xml. We show the configuration parameters in the following table:
Parameters | Description | Example |
---|---|---|
prsgenerator.addr | The IP address of the PRS Generator. | 192.168.0.1 |
controller.addr | The IP address of the controller. | 192.168.0.2 |
agents.addr | The IP addresses of all agents. | 192.168.0.3; 192.168.0.4; 192.168.0.5; 192.168.0.6; 192.168.0.7; 192.168.0.8; 192.168.0.9; 192.168.0.10; 192.168.0.11; 192.168.0.12; 192.168.0.13; 192.168.0.14; 192.168.0.15; 192.168.0.16 |
fullnode.addr | The IP address of the new node. | 192.168.0.17 |
controller.thread.num | The number of threads in the controller. | 20 |
agent.thread.num | The number of threads in each agent. | 20 |
cmddist.thread.num | The number of threads to distribute the commands. | 10 |
local.addr | The local IP address of a machine. | 192.168.0.2 for the controller; 192.168.0.3 for the first agent. |
block.directory | The directory to store blocks. | /home/pararc/ParaRC/blkDir for standalone mode; /home/pararc/hadoop-3.3.4-src/hadoop-dist/target/hadoop-3.3.4/dfs/data/current for hdfs-3 integration mode. |
stripestore.directory | The directory to store stripe metadata. | /home/pararc/ParaRC/stripeStore |
tradeoffpoint.directory | The directory to store the MLP generated by PRS Generator offline. | /home/pararc/ParaRC/tradeoffPoint |
pararc.mode | The mode of ParaRC. | standalone for standalone mode; hdfs3 for hdfs-3 integration mode. |
Here is a sample configuration file for the controller in /home/pararc/ParaRC/conf/sysSettings.xml:
<setting>
<attribute><name>prsgenerator.addr</name><value>192.168.0.1</value></attribute>
<attribute><name>controller.addr</name><value>192.168.0.2</value></attribute>
<attribute><name>agents.addr</name>
<value>192.168.0.3</value>
<value>192.168.0.4</value>
<value>192.168.0.5</value>
<value>192.168.0.6</value>
<value>192.168.0.7</value>
<value>192.168.0.8</value>
<value>192.168.0.9</value>
<value>192.168.0.10</value>
<value>192.168.0.11</value>
<value>192.168.0.12</value>
<value>192.168.0.13</value>
<value>192.168.0.14</value>
<value>192.168.0.15</value>
<value>192.168.0.16</value>
</attribute>
<attribute><name>fullnode.addr</name>
<value>192.168.0.17</value>
</attribute>
<attribute><name>controller.thread.num</name><value>20</value></attribute>
<attribute><name>agent.thread.num</name><value>20</value></attribute>
<attribute><name>cmddist.thread.num</name><value>10</value></attribute>
<attribute><name>local.addr</name><value>192.168.0.2</value></attribute>
<attribute><name>block.directory</name><value>/home/pararc/ParaRC/blkDir</value></attribute>
<attribute><name>stripestore.directory</name><value>/home/pararc/ParaRC/stripeStore</value></attribute>
<attribute><name>tradeoffpoint.directory</name><value>/home/pararc/ParaRC/tradeoffPoint</value></attribute>
<attribute><name>pararc.mode</name><value>standalone</value></attribute>
</setting>
For the agents, please also prepare a configuration file under /home/pararc/ParaRC/conf on each machine, with local.addr set to the agent's own IP address.
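For instance, the agents' files could be generated and distributed from the controller with a small loop. This is a minimal sketch, assuming the agents are reachable via SSH as agent1 to agent15 (the hostnames used in the block-distribution step later) and that each agent's file differs from the controller's only in local.addr:
$> cd /home/pararc/ParaRC
$> for i in {1..15}
> do
>   ip=192.168.0.$((i + 2))   # assumption: agent1 is 192.168.0.3, ..., agent15 is 192.168.0.17
>   sed "s#<name>local.addr</name><value>[^<]*#<name>local.addr</name><value>$ip#" \
>       conf/sysSettings.xml > /tmp/sysSettings.xml
>   scp /tmp/sysSettings.xml agent$i:/home/pararc/ParaRC/conf/sysSettings.xml
> done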
To generate the MLPs for an (n, k) MSR code, we need to do:
- Run GenMLP n times in the PRS Generator to generate the MLPs for repairing the n blocks in a stripe.
- Prepare an MLP file in the Controller that records the MLPs for future access.
To generate an MLP, please run the following command in the PRS Generator:
$> ./GenMLP [code] [n] [k] [w] [repairIdx]
- code
  - The code that we generate the MLP for.
  - We can set it as Clay for Clay codes.
- n
  - The erasure coding parameter n.
  - For example, n=14 for the (14,10) Clay code.
- k
  - The erasure coding parameter k.
  - For example, k=10 for the (14,10) Clay code.
- w
  - The sub-packetization level.
  - For example, w=256 for the (14,10) Clay code.
- repairIdx
  - The index of the block that we repair in a stripe.
  - For example, repairIdx=0 when we generate the MLP to repair the first block in a stripe.
- Example
  $> ./GenMLP Clay 14 10 256 0
After running this command, we will get a string, which represents the MLP generated to repair the block with index 0. The string specifies the color of each intermediate vertex, which represents an intermediate sub-block or a repaired sub-block.
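Since we need one MLP per block index, we can run GenMLP in a loop to generate all n of them; a minimal sketch for the (14,10) Clay code:
$> for i in {0..13}
> do
>   ./GenMLP Clay 14 10 256 $i
> done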
After we generate the n MLPs, we generate an MLP file, which is a .xml file in tradeoffpoint.directory. The parameters in the MLP file are as follows:
Parameter | Description | Example |
---|---|---|
code | Type of the erasure code. | Clay |
ecn | Erasure coding parameter n. | 14 |
eck | Erasure coding parameter k. | 10 |
ecw | Sub-packetization level. | 256 |
digits | The number of digits that represent a color. For example, as n=14, there are 14 possible colors in total, such that we represent each color by a two-digit number, from 0 to 13. | 2 |
point | An MLP to repair block index i is represented as i:coloring string, where i is the index of a block and coloring string is the coloring result for each intermediate vertex. | 0: 0000......1301 |
Please refer to the sample MLP files we generate for the (14,10) Clay code in /home/pararc/ParaRC/tradeoffPoint.
To generate a stripe of blocks, we need to do:
- Run GenData in the Controller to generate n blocks erasure-coded by an MSR code.
- Distribute the n blocks to n different agents.
- Generate a metadata file in stripestore.directory to record the metadata information of the stripe.
Here is the command to generate a stripe of blocks:
$> ./GenData [code] [n] [k] [w] [stripeidx] [blockbytes] [subpktbytes]
- code
  - The name of the erasure code.
  - For example, Clay for Clay codes.
- n
  - The erasure coding parameter n.
  - For example, n=14 for the (14,10) Clay code.
- k
  - The erasure coding parameter k.
  - For example, k=10 for the (14,10) Clay code.
- w
  - The sub-packetization level.
  - For example, w=256 for the (14,10) Clay code.
- stripeidx
  - The index of the stripe.
  - For example, stripeidx=0 for the first stripe we generate.
- blockbytes
  - The size of a block in bytes.
  - For example, blockbytes=268435456 for a block size of 256 MiB.
- subpktbytes
  - The size of a sub-packet in bytes.
  - For example, subpktbytes=65536 for a sub-packet size of 64 KiB.
For example, we generate a (14,10) Clay-coded stripe as follows:
$> ./GenData Clay 14 10 256 0 268435456 65536
Now we have a stripe of blocks in block.directory:
$> ls blkDir
stripe-0-0 stripe-0-1 stripe-0-10 stripe-0-11 stripe-0-12 stripe-0-13 stripe-0-2 stripe-0-3 stripe-0-4 stripe-0-5 stripe-0-6 stripe-0-7 stripe-0-8 stripe-0-9
The name of a block follows the format stripe-[stripeidx]-[blockidx]. For example, stripe-0-9 specifies that the block is in stripe 0 with block index 9.
Then, we distribute blocks to 14 different agents:
$> for i in {0..13}
> do
> nodeid=`expr $i + 1`
> scp stripe-0-$i agent$nodeid:/home/pararc/ParaRC/blkDir
> done
Now, the 14 agents, from agent1 to agent14, all have a block in block.directory.
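We can verify the placement from the controller; for example, agent1 should now hold the block stripe-0-0:
$> ssh agent1 ls /home/pararc/ParaRC/blkDir
stripe-0-0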
Finally, we generate the metadata file Clay-0.xml, meaning that this stripe is erasure-coded by the Clay code and its stripe index is 0. The parameters in a metadata file are as follows:
Parameters | Description | Example |
---|---|---|
code | The name of the erasure code. | Clay |
ecn | The erasure coding parameter n. | 14 for (14,10) Clay code. |
eck | The erasure coding parameter k. | 10 for (14,10) Clay code. |
ecw | The sub-packetization level. | 256 for (14,10) Clay code. |
stripename | The name of a stripe. | Clay-0 in our example. |
blocklist | The list of blocks in the stripe, including the name of each block and the IP of the node that stores it. | stripe-0-0:192.168.0.3 ...... |
blockbytes | The size of a block in bytes. | 268435456 for 256 MiB. |
subpktbytes | The size of a sub-packet in bytes. | 65536 for 64 KiB sub-packet. |
Here is the sample metadata file Clay-0.xml
:
<stripe>
<attribute><name>code</name><value>Clay</value></attribute>
<attribute><name>ecn</name><value>14</value></attribute>
<attribute><name>eck</name><value>10</value></attribute>
<attribute><name>ecw</name><value>256</value></attribute>
<attribute><name>stripename</name><value>Clay-0</value></attribute>
<attribute><name>blocklist</name>
<value>stripe-0-0:192.168.0.3</value>
<value>stripe-0-1:192.168.0.4</value>
<value>stripe-0-2:192.168.0.5</value>
<value>stripe-0-3:192.168.0.6</value>
<value>stripe-0-4:192.168.0.7</value>
<value>stripe-0-5:192.168.0.8</value>
<value>stripe-0-6:192.168.0.9</value>
<value>stripe-0-7:192.168.0.10</value>
<value>stripe-0-8:192.168.0.11</value>
<value>stripe-0-9:192.168.0.12</value>
<value>stripe-0-10:192.168.0.13</value>
<value>stripe-0-11:192.168.0.14</value>
<value>stripe-0-12:192.168.0.15</value>
<value>stripe-0-13:192.168.0.16</value>
</attribute>
<attribute><name>blockbytes</name><value>268435456</value></attribute>
<attribute><name>subpktbytes</name><value>65536</value></attribute>
</stripe>
Run the following command in the controller to start ParaRC:
$> python script/start.py
In the new node, we can test the parallel repair of a block:
$> ./DistClient degradeRead [blockname] [method]
- blockname
  - The name of a block.
  - For example, stripe-0-0 to repair the first block in our example.
- method
  - dist for parallel repair; conv for conventional centralized repair.
For example, we run the following command to test the parallel repair:
$> ./DistClient degradeRead stripe-0-0 dist
Note that to test different methods, please restart the system to avoid data being cached in each agent.
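For instance, to compare the two repair methods on the same block (a sketch; restart ParaRC between the two runs, as noted above):
$> ./DistClient degradeRead stripe-0-0 dist
$> # restart the system here so that no data is cached in the agents
$> ./DistClient degradeRead stripe-0-0 conv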
To test full-node recovery, please prepare multiple stripes when generating blocks (a sketch follows the example below). Then run the following command to repair all the blocks in a failed node:
$> ./DistClient nodeRepair [ip] [code] [method]
- ip
  - The IP of the failed node.
  - For example, 192.168.0.3 for agent1.
- code
  - The erasure code of the stripes.
  - For example, Clay.
- method
  - dist for parallel repair; conv for conventional centralized repair.
For example, we run the following command to repair agent1:
$> ./DistClient nodeRepair 192.168.0.3 Clay dist
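For instance, the multiple stripes mentioned above could be prepared by running GenData in a loop in the Controller before starting ParaRC; a minimal sketch that generates ten (14,10) Clay-coded stripes (the blocks of each stripe still need to be distributed and their metadata files generated, as in the single-stripe walkthrough):
$> for s in {0..9}
> do
>   ./GenData Clay 14 10 256 $s 268435456 65536
> done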
The HDFS3 integration of ParaRC is built with OpenEC atop Hadoop-3.3.4. Note that only the steps marked with * differ from those in the standalone mode; we focus on these steps in this part.
- Prepare a cluster in Alibaba Cloud *
- Compile ParaRC
- Deploy Hadoop-3.3.4 with OpenEC *
- Prepare configuration files for each machine *
- Generate the MLP offline
- Generate a stripe of blocks in Hadoop-3.3.4 *
- Start ParaRC
- Test parallel repair
We can use the same cluster as in the standalone mode. The following table shows how we deploy ParaRC with OpenEC and Hadoop-3.3.4.
ParaRC | OpenEC | Hadoop |
---|---|---|
PRS Generator | - | - |
Controller | Controller | NameNode |
Agents | Agents | DataNodes |
In the source code of ParaRC, we have a patch for OpenEC:
$> ls openec-pararc-patch
hdfs3-integration openec-patch
- hdfs3-integration
  - Source code patch and installation scripts for Hadoop-3.3.4.
- openec-patch
  - Source code patch for OpenEC.
We first deploy Hadoop-3.3.4 in the cluster, and then OpenEC.
We first download the source code of Hadoop-3.3.4 to /home/pararc/hadoop-3.3.4-src in the NameNode. Then we install the ParaRC patch for Hadoop-3.3.4:
$> cd /home/pararc/ParaRC/openec-pararc-patch/hdfs3-integration
$> ./install.sh
The install.sh script copies the ParaRC patch into the source code of Hadoop-3.3.4 and compiles the Hadoop-3.3.4 source code.
We follow the OpenEC document on configuring Hadoop-3.0.0 to configure Hadoop-3.3.4. We show the differences here:
- As the default username is pararc, please change it accordingly in the configuration files.
- Please use the IPs generated in Alibaba Cloud in your account when configuring Hadoop-3.3.4.
- In hdfs-site.xml:
Parameter | Description | Example |
---|---|---|
dfs.blocksize | The size of a block in bytes. | 268435456 for a block size of 256 MiB. |
oec.pktsize | The size of a packet in bytes. Note that for an MSR code with sub-packetization level w, the size of a packet is w times the size of a sub-packet. | 16777216 for the (14,10) Clay code, where w=256 and the sub-packet size is 64 KiB. |
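As a quick sanity check of the oec.pktsize rule, the packet size should equal w times the sub-packet size:
$> echo $((256 * 65536))
16777216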
After we generate all the configuration files, please follow the document for Hadoop-3.0.0 provided in OpenEC to deploy Hadoop-3.3.4 in the cluster.
We first download the source code of OpenEC to /home/pararc/openec in the Controller. Then we install the ParaRC patch for OpenEC:
$> cd /home/pararc/ParaRC/openec-pararc-patch/openec-patch
$> cp -r ./* /home/pararc/openec/src
$> cd /home/pararc/openec/
$> cmake . -DFS_TYPE:STRING=HDFS3
$> make
Now we have built the source code of OpenEC with the ParaRC patch.
We follow the OpenEC configuration document with the following differences:
- In ec.policy:
Parameter | Description | Example |
---|---|---|
ecid | Unique id for an erasure code. | clay_14_10 |
class | Class name of the erasure code implementation. | Clay |
n | Parameter n for the erasure code. | 14 |
k | Parameter k for the erasure code. | 10 |
w | Sub-packetization level. | 256 |
opt | Optimization level for OpenEC. We do not use optimization in OpenEC. | -1 |
- As the default username is pararc, please change it accordingly in the configuration files.
- Please use the IPs generated in Alibaba Cloud in your account when configuring OpenEC.
After we generate the configuration files, please follow the document in OpenEC to deploy OpenEC in the cluster.
In the configuration files of ParaRC, we have the following differences compared with the configuration in the standalone mode.
Parameters | Description | Example |
---|---|---|
block.directory | The directory to store blocks. | /home/pararc/hadoop-3.3.4-src/hadoop-dist/target/hadoop-3.3.4/dfs/data/current for hdfs-3 integration mode. |
We prepare a script script/gendata.py to generate blocks in Hadoop-3.3.4.
The usage of this script is:
$> python script/gendata.py [nStripes] [code] [n] [k] [w]
- nStripes
  - The number of stripes to generate.
- code
  - The name of the erasure code.
- n
  - The erasure coding parameter n.
- k
  - The erasure coding parameter k.
- w
  - The sub-packetization level.
- Example
  $> python script/gendata.py 1 Clay 14 10 256
After running this script, we generate a stripe of blocks as well as the corresponding metadata file.
Finally, we follow the steps in standalone mode to test the parallel repair.