Question about wget dataset results #13
Comments
I wonder if the problem is that my environment is not the same as yours. I am running everything in the following environment:
I have tried to evaluate wget under your environment setting (pytorch==2.1.0 and dgl==2.0.0). I'm getting the same results as with dgl==1.0.0, both with and without the pre-trained pkls.
Did you obtain the graphs.pkl by parsing the raw logs, or from the pkl provided by MAGIC?
What is your k (i.e. num_neighbors)? Using k == 1 on the wget dataset could be the cause.
The "zero-dimension" error is simply a bug. Modifying |
Yes. I'm getting normal evaluation results when k == 2, but results like yours when k == 1.
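For context, here is a minimal sketch of why k matters in a k-NN distance-based detector (this is illustrative, not the repository's actual eval code; `knn_anomaly_scores`, `benign`, and `test` are made-up names): with k == 1, a single benign embedding that happens to sit close to an attack embedding drives that sample's anomaly score to nearly zero, while k == 2 averages over more neighbors.

```python
# Hedged sketch of k-NN distance-based anomaly scoring; not the repo's exact code.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_anomaly_scores(train_emb, test_emb, k=2):
    """Score each test embedding by its mean distance to the k nearest benign embeddings."""
    nn = NearestNeighbors(n_neighbors=k).fit(train_emb)
    dist, _ = nn.kneighbors(test_emb)
    return dist.mean(axis=1)  # larger mean distance => more anomalous

# Hypothetical usage with random embeddings in place of the model's outputs.
rng = np.random.default_rng(0)
benign = rng.normal(size=(100, 16))
test = rng.normal(size=(25, 16))
scores = knn_anomaly_scores(benign, test, k=2)
```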
If your graphs.pkl is not the provided one, make sure that the node type at index 2 is 'task'.
Does your version of graphs.pkl match the size of the provided one? If not, what is your data source?
Is it possible that the order of the raw logs is different, which results in incorrect labeling during loaddata and triggers a shift in node type indices as a byproduct?
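A quick way to sanity-check the size question above is to compare the two pkl files directly. This is a hedged sketch: the paths are placeholders, and it assumes graphs.pkl is an ordinary pickle file.

```python
# Compare a locally parsed graphs.pkl against the one shipped with the repository.
# Paths are placeholders; adjust them to your own layout.
import os
import pickle

for path in ["./my_parsed/graphs.pkl", "./provided/graphs.pkl"]:
    size_mb = os.path.getsize(path) / 1e6
    with open(path, "rb") as f:
        obj = pickle.load(f)
    n = len(obj) if hasattr(obj, "__len__") else "?"
    print(f"{path}: {size_mb:.1f} MB, {n} top-level entries")
```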
I just tried indices 0-7, and it still didn't work well. I'll start by downloading the data again this afternoon and try rebuilding it. It's a really strange problem.
I forget whether the attack logs should be the first 25 or the last 25 logs to be parsed, but this absolutely matters.
@jiangdie666 @Jimmyokok I have another question. If I'm not wrong, in the original paper the results for wget were reported as follows:

I have done the Quick Evaluation and got the following results:
[n_graph, n_node_feat, n_edge_feat]: [150, 8, 4]

I also saw that the last results @jiangdie666 shared were close to mine. What might be the reason for the different results for Precision, F1, and AUC?
I have rerun the Quick Evaluation with exactly the same data, checkpoints, and code as in this repository, which gives me this:
With seed 2022, which aligns with the repository code, I'm getting this:
Could you try averaging over multiple seeds? Maybe seed 2022 just happens to perform particularly badly on other devices?
Using the graphs.pkl extracted from the project's own graphs.zip, the results are satisfactory, which means there is still a small bug in the code that preprocesses the wget data.
Hello, thank you for pointing out the issues in the thread. I have corrected the file name issue, and now the first 25 files processed are attack logs and the next 125 files are normal logs. I am using the dependencies mentioned in the repo and k=2.

For the wget dataset, the graphs.pkl and checkpoint present in the repo give me the following results:
[n_graph, n_node_feat, n_edge_feat]: [150, 8, 4]

However, when I parse the raw logs, then train and evaluate, I get the following results (more similar to @SaraDadjouy's):
[n_graph, n_node_feat, n_edge_feat]: [150, 8, 4]

I would appreciate any guidance on what might be causing the difference in metrics. I have also tried using different seeds and am not getting better results.
@m-shayan73 I repeated my data processing again from scratch, and came to notice that the for loop at line 790 of wget_parser.py is not processing files in the desired order (the first 25 should be attack logs and the next 125 normal logs). After sorting the filenames (see the sketch below), the resulting new graphs.pkl gives me: If I re-train the model, the result becomes: You mentioned that you 'have corrected the file name issue and now the first 25 files processed are attack logs then next 125 files are normal logs', and still get unsatisfactory results. Does that mean you have already observed and corrected the above issue, and the problem still persists?
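The fix is essentially the following. This is a sketch only: the variable names and the `parse_one_log` stub are illustrative, not the actual code in wget_parser.py, and the path is a placeholder.

```python
# Sketch of the file-ordering fix around the parsing loop in wget_parser.py.
import os

def parse_one_log(path, label):
    """Stub standing in for the real per-file parsing logic."""
    print(f"parsing {path} with label {label}")

log_dir = "./data/wget/raw"              # placeholder path to the raw wget logs
filenames = sorted(os.listdir(log_dir))  # os.listdir returns an arbitrary order; sort it

for i, fname in enumerate(filenames):
    # Labels depend on position: the first 25 parsed files are treated as attack
    # logs and the remaining 125 as benign, so the iteration order must be fixed.
    parse_one_log(os.path.join(log_dir, fname), label=int(i < 25))
```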
I will try your code then. My version simply sorts the filenames based on string order, which may be different from yours in the indices of node and edge types. |
@m-shayan73 I'm able to reproduce your result. However, it's a seed issue. Measuring the average AUC over 100 random seeds gives me AUC~0.9450 under both file orders. I have also found a trick that improves the wget detection AUC (~0.9450 to ~0.9650), which I have just pushed in the '1.0.6' commit.
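For anyone reproducing this, a hedged sketch of averaging the AUC over many seeds; `evaluate_wget` is a placeholder stub, to be replaced by whatever evaluation entry point you actually call (e.g. the routine behind eval.py).

```python
# Average detection AUC over many random seeds to smooth out seed-specific swings.
import random
import numpy as np
import torch

def evaluate_wget():
    """Placeholder: substitute your actual evaluation call here."""
    return float(np.random.rand())  # dummy AUC so the sketch runs on its own

def run_with_seed(seed):
    # Set every relevant RNG before running one full evaluation.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    return evaluate_wget()

aucs = [run_with_seed(s) for s in range(100)]
print(f"mean AUC over {len(aucs)} seeds: {np.mean(aucs):.4f}")
```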
Thanks a lot for the help. It was indeed a seed issue: I had changed the seed in one file (./eval.py), but it was being changed again in another file (./model/eval.py). The results are better, especially with the updates in 1.0.6. Just to confirm, is the change that instead of using only the task/index 2 nodes, we now pool over 5 different node types? Is there a reason to select these 5 node types instead of all 8, or some other combination?
The reason is that when I tested the full combination yesterday, it yielded NaN values in the prediction results. I have just fixed the bug related to it, and now the result is AUC~0.9750.
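To make the pooling change concrete, here is a sketch of pooling node embeddings over a subset of node types; the type indices, the chosen subset, and the function name are illustrative, following the discussion above rather than copied from the repository.

```python
# Mean-pool node embeddings over a chosen subset of node types instead of only
# type index 2 ('task'). Guarding against an empty selection avoids dividing by
# zero, which is one way an all-types or mismatched selection can produce NaN.
import torch

def pool_by_node_types(node_emb, node_types, keep_types):
    """node_emb: (N, d) embeddings; node_types: (N,) integer type ids."""
    mask = torch.isin(node_types, torch.tensor(sorted(keep_types)))
    if mask.sum() == 0:
        # Fall back to pooling over all nodes rather than returning NaN.
        return node_emb.mean(dim=0)
    return node_emb[mask].mean(dim=0)

# Hypothetical usage: pool over five of the eight wget node types.
emb = torch.randn(10, 64)
types = torch.randint(0, 8, (10,))
graph_repr = pool_by_node_types(emb, types, keep_types={0, 1, 2, 3, 4})
```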
I have evaluated the wget dataset twice: once with a pkl I generated and trained myself using your project's code, and once with the trained pkl that comes with your project. But the results are not satisfactory in either case. Is it because I didn't set some other parameter details?
Evaluation results from my own training on the raw data:
Results with the pkl that comes with your project: