2024-04-17
PAMREIN's daily Open Notebook (COMMONS Lab)
Check Github
Todo --[]
Meetings
Daily report (What did I learn?)
Switch the remote from HTTPS to SSH (this requires an SSH key that is also saved on GitHub):
- git remote -v # check if remote is set to https
- git remote set-url origin <ssh-url> # SSH form, e.g. git@github.com:<user>/<repo>.git
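A minimal sketch of the HTTPS-to-SSH rewrite, assuming a GitHub-style remote URL. The helper name https_to_ssh and the example URL are hypothetical; the real repository URL is not recorded in these notes.

```shell
# Hypothetical helper: rewrite https://<host>/<user>/<repo> into git@<host>:<user>/<repo>.
https_to_ssh() {
    printf '%s\n' "$1" | sed -E 's#^https://([^/]+)/(.+)$#git@\1:\2#'
}

# Placeholder URL for illustration only (not the real remote).
ssh_url=$(https_to_ssh "https://github.com/example-user/example-repo.git")
echo "$ssh_url"

# Inside a real repository one could then run (left commented out here):
# git remote set-url origin "$(https_to_ssh "$(git remote get-url origin)")"
# git remote -v   # verify that origin now shows the SSH URL
```

Running git remote -v again afterwards should show the git@ form for both fetch and push.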
Further analysis of ./MINE-Database/mine_database/data/metacyc_rules/:
- metacyc_intermediate_rules.tsv - 7359 lines
- metacyc_generalized_rules.tsv - 1225 lines
comm -12 metacyc_generalized_rules_sorted.tsv metacyc_intermediate_rules_sorted.tsv | wc
426 852 8896
comm -31 metacyc_generalized_rules_sorted.tsv metacyc_intermediate_rules_sorted.tsv | wc
6933 13866 171744
comm -23 metacyc_generalized_rules_sorted.tsv metacyc_intermediate_rules_sorted.tsv | wc
799 1598 16695
comm -3 metacyc_generalized_rules_sorted.tsv metacyc_intermediate_rules_sorted.tsv | wc
7732 15464 195372
comm options:
-1  suppress column 1 (lines unique to FILE1)
-2  suppress column 2 (lines unique to FILE2)
-3  suppress column 3 (lines that appear in both files)
- unique to generalized rules: 799
- unique to intermediate rules: 6933
- unique entries total: 7732
- found in both files: 426
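The counts are consistent with the file sizes: 426 + 799 = 1225 (generalized) and 426 + 6933 = 7359 (intermediate), and 799 + 6933 = 7732. The same comparison can be sketched on toy data; the file names and rule names below are made up, and comm is assumed to receive inputs sorted under the same locale.

```shell
# Toy data in a temporary directory (rule names invented for illustration).
tmp=$(mktemp -d)
printf 'ruleC\nruleA\nruleB\n' > "$tmp/gen.tsv"
printf 'ruleE\nruleB\nruleD\nruleC\n' > "$tmp/int.tsv"

# comm expects both inputs to be sorted in the same collation order.
LC_ALL=C sort "$tmp/gen.tsv" > "$tmp/gen_sorted.tsv"
LC_ALL=C sort "$tmp/int.tsv" > "$tmp/int_sorted.tsv"

# $(( ... )) strips the padding that some wc implementations add.
both=$(( $(LC_ALL=C comm -12 "$tmp/gen_sorted.tsv" "$tmp/int_sorted.tsv" | wc -l) ))
only_gen=$(( $(LC_ALL=C comm -23 "$tmp/gen_sorted.tsv" "$tmp/int_sorted.tsv" | wc -l) ))
only_int=$(( $(LC_ALL=C comm -13 "$tmp/gen_sorted.tsv" "$tmp/int_sorted.tsv" | wc -l) ))
either=$(( $(LC_ALL=C comm -3 "$tmp/gen_sorted.tsv" "$tmp/int_sorted.tsv" | wc -l) ))

echo "both=$both only_gen=$only_gen only_int=$only_int either=$either"
rm -rf "$tmp"
```

If both + only_gen does not equal the line count of the first file, the input most likely contains duplicate lines or was sorted under a different locale.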
To check further, I extracted the rule name and the "reactioncomment" column (fields 1 and 5):
- Some rule names contain an underscore, and I wanted to find out what that means.
cut -f1,5 metacyc_intermediate_rules.tsv | grep -v -E "[a-z]*_[a-zA-Z0-9]*" | wc
This returned 91 lines without an underscore, with the following structure (examples):
rule0945 metacyc:RXN-17975;metacyc:RXN-17988
rule0906 kegg:R07039;kegg:R07043;metacyc:RXN-14948;metacyc:RXN-14952
rule0323 RXN-17679;RXN-527;RXN-7686;RXN-7687;RXN-8379;RXN-8450;RXN1F-93
rule1260 N-ACETYLHEXOSAMINE-1-DEHYDROGENASE-RXN
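Note that the grep pattern above matches an underscore anywhere on the line, so a row is also dropped when only the comment column contains one. A small sketch of testing the rule name alone with awk; the sample rows are invented and only mimic the two-column output of cut -f1,5.

```shell
tmp=$(mktemp)
# Toy rows: name<TAB>comment (made up for illustration).
printf 'rule0001_01\tmetacyc:RXN-1\n' > "$tmp"
printf 'rule0002\tmetacyc:RXN-2\n' >> "$tmp"
printf 'rule0003\tSOME_NAME-RXN\n' >> "$tmp"

# grep -v drops every line that has an underscore in either column.
no_underscore_anywhere=$(( $(grep -v "_" "$tmp" | wc -l) ))
# awk tests only the first tab-separated field (the rule name).
no_underscore_in_name=$(( $(awk -F'\t' '$1 !~ /_/' "$tmp" | wc -l) ))

echo "anywhere=$no_underscore_anywhere name_only=$no_underscore_in_name"
rm -f "$tmp"
```

On the real file the two counts only differ if some comment contains an underscore, which these notes do not confirm either way.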
It looks like "RXN" is involved almost every time. To check that:
cut -f1,5 metacyc_intermediate_rules.tsv | grep ":RXN-" | wc
# 3598 lines
cut -f1,5 metacyc_intermediate_rules.tsv | grep "metacyc:RXN-" | wc
# 3598 lines
cut -f1,5 metacyc_intermediate_rules.tsv | grep "RXN" | wc
# 5291 lines
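Since every line matching "metacyc:RXN-" also matches "RXN", the difference 5291 - 3598 = 1693 is the number of lines that mention RXN only in some other form (for example a bare RXN-... id or a name ending in -RXN). A sketch of counting that remainder directly, on invented sample rows:

```shell
tmp=$(mktemp)
# Toy rows mimicking the output of cut -f1,5 (ids are made up).
printf 'r1\tmetacyc:RXN-1;metacyc:RXN-2\n' > "$tmp"
printf 'r2\tRXN-3;RXN1F-93\n' >> "$tmp"
printf 'r3\tkegg:R00001\n' >> "$tmp"
printf 'r4\tSOME-ENZYME-RXN\n' >> "$tmp"

any_rxn=$(( $(grep -c "RXN" "$tmp") ))
prefixed=$(( $(grep -c "metacyc:RXN-" "$tmp") ))
# Lines that mention RXN but never in the metacyc:RXN- form.
other_form=$(( $(grep "RXN" "$tmp" | grep -v "metacyc:RXN-" | wc -l) ))

echo "any=$any_rxn prefixed=$prefixed other=$other_form"
rm -f "$tmp"
```

grep -c counts matching lines, not occurrences, so a line with several ids is still counted once, the same as piping to wc and reading the first number.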
The other databases are distributed as follows:
(e.g.: cut -f1,5 metacyc_intermediate_rules.tsv | grep "kegg" | grep -c "metacyc")
- KEGG: 4198
- BRENDA: 5656
- MetaCyc: 5258
- KEGG & BRENDA: 3250
- KEGG & MetaCyc: 3783
- BRENDA & MetaCyc: 3756
- BRENDA & MetaCyc & KEGG: 3002
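Instead of one grep chain per combination, all seven counts can be collected in a single awk pass. This is only a sketch on made-up rows; like the grep commands above, it assumes the lowercase prefixes kegg, brenda and metacyc and matches them anywhere on the line.

```shell
tmp=$(mktemp)
# Toy rows mimicking the output of cut -f1,5 (ids are invented).
printf 'r1\tkegg:R1;metacyc:RXN-1\n' > "$tmp"
printf 'r2\tbrenda:BR1;kegg:R2\n' >> "$tmp"
printf 'r3\tmetacyc:RXN-3\n' >> "$tmp"
printf 'r4\tbrenda:BR4;kegg:R4;metacyc:RXN-4\n' >> "$tmp"

counts=$(awk '
    # One flag per database for the current line.
    { k = ($0 ~ /kegg/); b = ($0 ~ /brenda/); m = ($0 ~ /metacyc/) }
    k           { kegg++ }
    b           { brenda++ }
    m           { metacyc++ }
    k && b      { kb++ }
    k && m      { km++ }
    b && m      { bm++ }
    k && b && m { kbm++ }
    END {
        printf "kegg=%d brenda=%d metacyc=%d kegg_brenda=%d kegg_metacyc=%d brenda_metacyc=%d all_three=%d\n",
               kegg, brenda, metacyc, kb, km, bm, kbm
    }
' "$tmp")

echo "$counts"
rm -f "$tmp"
```

On the real data this would be fed from cut -f1,5 metacyc_intermediate_rules.tsv, and the numbers should match the list above if the matching assumptions hold.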