cSmartML is an automated clustering tool that uses meta-learning and evolutionary algorithms to find the best configurations for clustering a given dataset. It is currently limited to numerical datasets, supports eight clustering algorithms available in scikit-learn, and uses parallelization and tools from DEAP to return a list of the top configurations found. Evaluation is multi-objective, and the best configurations are selected with the NSGA-II Pareto scheme.
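To make the idea of Pareto-based selection concrete, here is a minimal sketch of how a non-dominated set can be picked from multi-objective scores. It is not cSmartML's own code: the function names and the example scores are hypothetical, and it covers only the first non-dominated front, not the full NSGA-II procedure (which also ranks later fronts and uses crowding distance).

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is at least as good as `b` on every objective and
    strictly better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of configurations that no other configuration dominates."""
    return [
        i
        for i, candidate in enumerate(scores)
        if not any(dominates(other, candidate)
                   for j, other in enumerate(scores) if j != i)
    ]


# Hypothetical scores, one tuple per configuration. Both objectives are
# written so that higher is better (a "lower is better" metric is negated).
scores = [(0.62, -0.80), (0.55, -0.60), (0.50, -0.90), (0.62, -0.70)]
front = pareto_front(scores)
print(front)  # indices of the non-dominated configurations
```

Configurations on the front are trade-offs: none of them can be improved on one metric without getting worse on another.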
So, let's take a look at how to use the cSmartML framework!
git clone https://github.com/DataSystemsGroupUT/CSmartML.git
Based on Python 3.6.
Create an environment and then install all requirements:
pip install -r requirements.txt
Make sure all dependencies are installed. Also make sure that the Redis server
is running in the background:
sudo systemctl status redis
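On machines without systemd, the same check can be done from Python. The sketch below sends a plain PING over a TCP socket and looks for the +PONG reply. The function name is hypothetical, and the host and port are assumptions (Redis listens on 6379 by default, but the address cSmartML expects may differ). A server that requires authentication will answer with an error, which this sketch reports as unreachable.

```python
import socket


def redis_is_reachable(host: str = "127.0.0.1", port: int = 6379,
                       timeout: float = 1.0) -> bool:
    """Return True if a server at host:port answers a Redis PING with +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            # Redis accepts simple inline commands terminated by CRLF.
            conn.sendall(b"PING\r\n")
            return conn.recv(16).startswith(b"+PONG")
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False
```

Calling `redis_is_reachable()` before starting the servers gives a quick yes/no answer without needing the `redis` Python package.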
To start the two servers:
(Navigate to /server/)
python sse.py
(Navigate to /interface/)
npm i
yarn start
Built-in datasets are available for testing the platform. Otherwise, all uploaded datasets must be numerical and in CSV format. Pre-processing of datasets is not yet supported.
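Since no pre-processing is done by the platform, it can help to check a CSV file before uploading it. The following is a minimal sketch using only the standard library. The function name is hypothetical, and the assumption that the first row is a header is ours (pass `has_header=False` otherwise). Note that `float()` also accepts values such as `nan` and `inf`, so those are not flagged.

```python
import csv
from typing import List, Tuple


def non_numeric_cells(path: str, has_header: bool = True) -> List[Tuple[int, int, str]]:
    """Return (line_number, column_index, value) for every cell that
    cannot be parsed as a float. An empty list means the file is numerical."""
    problems = []
    with open(path, newline="") as f:
        reader = csv.reader(f)
        if has_header:
            next(reader, None)  # skip the header row
        first_line = 2 if has_header else 1
        for line_number, row in enumerate(reader, start=first_line):
            for col, value in enumerate(row):
                try:
                    float(value)
                except ValueError:
                    # Covers text values and empty (missing) cells.
                    problems.append((line_number, col, value))
    return problems
```

An empty result means every data cell parsed as a number. Any entries returned point to the cells that need cleaning before upload.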