Mukilan wrote:
I would like to suggest a new Data Hazard category called "Malicious Combo", which has the potential to affect many facets of life. This category describes what open-source intelligence (OSINT) commonly relies on: data that is freely available on the internet and causes no harm on its own, but that can be combined with other non-malicious data to cause severe harm.
There is a familiar precedent in the mixing of chemicals. For example, a common house-cleaning tip is never to mix bleach and vinegar, because these two everyday household items react to release chlorine gas. Harmful actors have exploited combinations like this in many instances for malicious activities.
In data science, anyone's social media posts can be scraped to inform decisions that affect people. For example, posts about pets and holiday travel could be scraped to predict risk and maliciously restrict a person's access to insurance, credit and much more. This is very similar to predictive targeted advertising, but turned around and used for the wrong purposes. The same approach can serve economic, political and military ends. Another example: several studies of politics have used scraped Reddit data as a source, and most of these datasets are publicly available. Polarisation on Reddit has been studied before, and with the right knowledge it is quite feasible to deploy malicious bots that provoke and destabilise society through misinformation. There are similar examples in many walks of life.
The main idea is that publicly available data is not harmful in itself, but the right combination can cause damage. The label is meant to prompt a check on whether a newly created data set could lead to this.
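As a rough illustration of that check, here is a minimal sketch of how someone publishing a data set might estimate how many of its records become linkable once it is combined with another public source. Everything in it is an assumption made for illustration: the function name `uniquely_linkable`, the field names and the toy records are hypothetical and do not come from any real data set.

```python
from collections import Counter

def uniquely_linkable(released, auxiliary, keys):
    """Return rows of `released` that match exactly one row of `auxiliary` on `keys`."""
    def key_of(row):
        return tuple(row[k] for k in keys)

    # Count how often each key combination occurs in the auxiliary source.
    aux_counts = Counter(key_of(row) for row in auxiliary)
    # A released row is at risk when the shared fields point to a single auxiliary row.
    return [row for row in released if aux_counts[key_of(row)] == 1]

# Hypothetical toy data: neither table looks sensitive on its own.
released = [
    {"postcode": "AB1", "birth_year": 1990, "claim": "pet injury"},
    {"postcode": "AB1", "birth_year": 1985, "claim": "travel delay"},
    {"postcode": "CD2", "birth_year": 1990, "claim": "none"},
]
auxiliary = [
    {"postcode": "AB1", "birth_year": 1990, "handle": "user_a"},
    {"postcode": "AB1", "birth_year": 1985, "handle": "user_b"},
    {"postcode": "AB1", "birth_year": 1985, "handle": "user_c"},
]

at_risk = uniquely_linkable(released, auxiliary, ["postcode", "birth_year"])
print(len(at_risk), "of", len(released), "released records link to exactly one auxiliary record")
```

A high share of uniquely linkable records would be one signal that the data set is only a join away from misuse, even though each table alone looks harmless.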
Reference: https://doi.org/10.5604/01.3001.0012.1474
Before data sets are made publicly available, the ethical considerations should include which data sets are only a few steps away from being misused (indirect rather than direct harm). Decentralising data is also good practice, so that not everything is readily available in one centralised place. Combining existing Data Hazard warnings to predict which outcomes could lead to this hazard is another viable method.
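The last idea, combining existing warnings, could be sketched roughly as follows. This is only an assumption about how such a rule might look: the label strings, the `TRIGGERS` list and the function `emergent_combinations` are hypothetical placeholders, not the project's actual labels or tooling. The sketch reports label combinations that appear only when data sets are pooled and that no single data set carries on its own.

```python
# Hypothetical label names and trigger combinations, for illustration only.
TRIGGERS = [
    frozenset({"risk-to-privacy", "automates-decision-making"}),
    frozenset({"risk-to-privacy", "danger-of-misuse"}),
]

def emergent_combinations(datasets, triggers):
    """Return trigger combinations met by the pooled labels but by no single data set."""
    pooled = set().union(*(d["labels"] for d in datasets))
    return [
        t for t in triggers
        if t <= pooled and not any(t <= set(d["labels"]) for d in datasets)
    ]

# Two hypothetical data sets, each carrying one warning label.
datasets = [
    {"name": "scraped_posts", "labels": ["risk-to-privacy"]},
    {"name": "scoring_model", "labels": ["automates-decision-making"]},
]

flagged = emergent_combinations(datasets, TRIGGERS)
print(flagged)
```

Which combinations should count as triggers is a judgement call that would need discussion; the code only shows the mechanics of checking them.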