This repository has been archived by the owner on Jan 3, 2023. It is now read-only.
SSM v1.4.0
Highlights:
- Small File Solution. SSM compacts a batch of small files into one big file stored in HDFS. This saves the memory used for managing block info, improves Namenode scalability, and also optimizes small file read performance.
- Disaster Recovery Solution with S3 support: files can now be synced to an S3 cluster. [Experimental]
- Refined SSM high-availability.
- Functional enhancements:
  - Add one-shot rule support
  - Add file relative-temperature support
  - Add new actions (sleep, truncate0, alldisk, onedisk, ramdisk, copy2s3)
  - Add new properties for defining rules (isDir, acTop, acTopSp, acBot, acBotSp, storage capacity/free/utilization)
  - Support adding new SSM Agents dynamically
  - Add throughput throttling for Disaster Recovery Solution
  - UI enhancements for monitoring Metastore utilization, submitting rule/cmdlet, displaying resource ...
  - Purging of historical cmdlet, file access, and file diff info
  - Configurable JVM parameters for SSM server/agents
  - Support deferred cmdlet execution
- Performance optimizations:
  - Concurrent Namespace fetcher
  - Concurrent cmdlet dispatcher
  - Optimized meta store access.
  - Optimizations for handling large HDFS namespace
  - Optimize cmdlet status report
  - Fine-grained locks for many shared resources
  - Load balance for cmdlet execution
- Refined SSM documents
- Add many scripts for SSM functional and performance tests
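
The small file compaction idea from the highlights can be illustrated with a minimal sketch. This is not SSM's implementation: the function names `compact` and `read_small_file` are hypothetical, and in-memory byte strings stand in for HDFS files. It only shows the general technique of packing many small files into one container plus an (offset, length) index, so that each original file can still be read back individually.

```python
import io
from typing import Dict, Tuple

Index = Dict[str, Tuple[int, int]]  # path -> (offset, length) inside the container


def compact(small_files: Dict[str, bytes]) -> Tuple[bytes, Index]:
    """Concatenate small files into one container and record where each one lives."""
    container = io.BytesIO()
    index: Index = {}
    for path, data in small_files.items():
        # Remember the current write position before appending this file's bytes.
        index[path] = (container.tell(), len(data))
        container.write(data)
    return container.getvalue(), index


def read_small_file(container: bytes, index: Index, path: str) -> bytes:
    """Read one original file back out of the container using its index entry."""
    offset, length = index[path]
    return container[offset:offset + length]
```

Under this scheme the storage layer tracks one large file instead of many small ones, which is the source of the block-info memory savings described above; the index itself still has to be stored somewhere, and where SSM keeps it is not stated in these notes.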
Change log:
- #1442, Fix and refactor CopyScheduler failover (#1826)
- #1820, Fix bug where a file rename action was reported as successful although it actually failed (#1821)
- #1791, Fix destination path bug in copy scheduler (#1792)
- #1787, Fix timing bug for schedule-failed cmdlets (#1788)
- Avoid NPE when tackling timeout action (#1780)
- #1701, Add one-shot rule support
- #1681, Add data throttling for file copy action
- #1678, Fix delete failure during DFSIO (#1741)
- #1728, Dispatch cmdlet to given node with free slot (#1729)
- #1721, Fix namespace mismatch caused by unlink (#1717)
- #1707, Refine handling of AT-trigger rule (#1708)
- #1692, Defer cmdlet execution support (#1696)
- #1688, Downgrade an error log to debug in CopyScheduler.baseSync (#1695)
- #1653, Fix cmdlet generation issues when stopping rule (#1654)
- Avoid NPE for inferCmdletStatus and batchsync actions (#1648)
- #1605, Delete unfinished cmdlet and action when a rule is disabled (#1607)
- #1636, Avoid null pointer exception for mapStorageCapacity (#1638)
- #1624, Synchronization issue on mapStorageCapacity (#1625)
- #1617, Fix namespace fetching heap memory usage issue (#1618)
- #1615, Fix bug in RPC API getActionInfo (#1616)
- #1608, Fix file mover OOM issue (#1610)
- #1595, Fix: use the correct RPC server address to create SmartDFSClient
- #1176, Show action execution result under submission area. (#1592)
- #1524 and #1563, Tune performance on large namespace (#1566)
- #1568, Create SSM Id file /system/ssm.id in HDFS (#1569)
- Fix file_state key length on MySQL 5.6 or older (#1551)
- #1546, Testing feature: Skip fetching the entire HDFS namespace and update based on iNotify only
- Format the database only when all necessary tables don't exist
- #1531, Adjust storage utilization UI page
- #1519, Fix dest path issue in copy2s3 related rule (#1521)
- Fix action args column type and catch launchCmdlet exception.
- #1517, Fix fake data generation bug (#1518)
- #1504, Fix long-run action state update issue (#1506)
- #1480, Add statistic info for dispatcher
- #1499, Fix memory exhaustion caused by too many pending cmdlets (#1500)
- #1478, Fix cmdlet dispatcher performance issue (#1479)
- Fix the notebook bugs. (#1465)