Matches in Nanopublications for { ?s ?p ?o <https://w3id.org/np/RAuZdwi5v34TP1MWsdSxlqcFFrNgAsnNfPkF_bF-PjuGA/assertion>. }
- 11f3e069-8d1d-48da-876f-52fd6d255223 type ComputationalWorkflow assertion.
- bts480 type ComputerLanguage assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 type SoftwareSourceCode assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df type ResearchObject assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df type LiveRO assertion.
- 17b98cad-68c7-4991-acb1-b922f0c3d44f type Folder assertion.
- 28d4cc6a-cc26-4629-865c-951dcec04a63 type Folder assertion.
- 72f47650-7679-4648-bb43-d28fa77665c4 type Folder assertion.
- ef0d1b26-8b1c-4a45-9758-52610cb8e3a8 type Folder assertion.
- 043ddb69-1a1c-4eb4-8785-9e3803fa377c type Resource assertion.
- 0cdb95da-31b8-40ad-a2f9-32154e335db2 type Resource assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 type Resource assertion.
- 1dd2de01-f16f-4704-853a-116e2de3ff65 type Resource assertion.
- 21e37e3d-1372-4acd-9ce7-b1a005e9a41a type Resource assertion.
- 2dd40f39-d265-4afe-ab50-5238e7bd6b16 type Resource assertion.
- 52497c90-bae5-4fbd-aca5-b5175ff7a4fa type Resource assertion.
- 5e018f3a-bc34-48db-b72e-f85214a13a9d type Resource assertion.
- 6420e76a-1e0e-42a0-9b9e-e355de2a88a5 type Resource assertion.
- 64cb8be0-e907-41a0-8f6b-dbf642fe4372 type Resource assertion.
- 69f9d60c-925c-4b44-9b87-fc35c96eed2f type Resource assertion.
- 6aadc530-3b55-44e0-8c8d-8762952df493 type Resource assertion.
- 70cec592-1fbe-4b53-a8ef-8dd3522246d6 type Resource assertion.
- 7d430516-d8cb-4bc0-a4d9-e103a63e3478 type Resource assertion.
- 89e1f03d-e2e9-46f1-8386-8cdeff35386c type Resource assertion.
- b2864dbb-5ce6-4ad9-8dcb-27a0325baeb1 type Resource assertion.
- d6d97bc9-1631-4347-920b-dec30f6aebea type Resource assertion.
- 0000-0003-4771-6113 type Person assertion.
- 554 type Person assertion.
- Workflow-RO-Crate type CreativeWork assertion.
- 043ddb69-1a1c-4eb4-8785-9e3803fa377c type CreativeWork assertion.
- ro-crate-metadata.json type CreativeWork assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df type Dataset assertion.
- 17b98cad-68c7-4991-acb1-b922f0c3d44f type Dataset assertion.
- 28d4cc6a-cc26-4629-865c-951dcec04a63 type Dataset assertion.
- 72f47650-7679-4648-bb43-d28fa77665c4 type Dataset assertion.
- ef0d1b26-8b1c-4a45-9758-52610cb8e3a8 type Dataset assertion.
- 043ddb69-1a1c-4eb4-8785-9e3803fa377c type MediaObject assertion.
- 0cdb95da-31b8-40ad-a2f9-32154e335db2 type MediaObject assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 type MediaObject assertion.
- 1dd2de01-f16f-4704-853a-116e2de3ff65 type MediaObject assertion.
- 21e37e3d-1372-4acd-9ce7-b1a005e9a41a type MediaObject assertion.
- 2dd40f39-d265-4afe-ab50-5238e7bd6b16 type MediaObject assertion.
- 52497c90-bae5-4fbd-aca5-b5175ff7a4fa type MediaObject assertion.
- 5e018f3a-bc34-48db-b72e-f85214a13a9d type MediaObject assertion.
- 6420e76a-1e0e-42a0-9b9e-e355de2a88a5 type MediaObject assertion.
- 64cb8be0-e907-41a0-8f6b-dbf642fe4372 type MediaObject assertion.
- 69f9d60c-925c-4b44-9b87-fc35c96eed2f type MediaObject assertion.
- 6aadc530-3b55-44e0-8c8d-8762952df493 type MediaObject assertion.
- 70cec592-1fbe-4b53-a8ef-8dd3522246d6 type MediaObject assertion.
- 7d430516-d8cb-4bc0-a4d9-e103a63e3478 type MediaObject assertion.
- 89e1f03d-e2e9-46f1-8386-8cdeff35386c type MediaObject assertion.
- b2864dbb-5ce6-4ad9-8dcb-27a0325baeb1 type MediaObject assertion.
- d6d97bc9-1631-4347-920b-dec30f6aebea type MediaObject assertion.
- about.workflowhub.eu type Organization assertion.
- 191 type Organization assertion.
- 191 type Project assertion.
- 0000-0003-2388-0744 type Agent assertion.
- enrichment_service-account-enrichment type Agent assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df mainEntity "Snakefile" assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df importedBy 0000-0003-2388-0744 assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 sdPublisher "https://about.workflowhub.eu/" assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 producer "https://workflowhub.eu/projects/191" assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df description "# ERGA Protein-coding gene annotation workflow. Adapted from the work of Sagane Joye: https://github.com/sdind/genome_annotation_workflow ## Prerequisites The following programs are required to run the workflow and the listed versions were tested. It should be noted that older versions of snakemake are not compatible with newer versions of singularity as is noted here: [https://github.com/nextflow-io/nextflow/issues/1659](https://github.com/nextflow-io/nextflow/issues/1659). `conda v 23.7.3` `singularity v 3.7.3` `snakemake v 7.32.3` You will also need to acquire a licence key for Genemark and place this in your home directory with name `~/.gm_key` The key file can be obtained from the following location, where the licence should be read and agreed to: http://topaz.gatech.edu/GeneMark/license_download.cgi ## Workflow The pipeline is based on braker3 and was tested on the following dataset from Drosophila melanogaster: [https://doi.org/10.5281/zenodo.8013373](https://doi.org/10.5281/zenodo.8013373) ### Input data - Reference genome in fasta format - RNAseq data in paired-end zipped fastq format - uniprot fasta sequences in zipped fasta format ### Pipeline steps - **Repeat Model and Mask** Run RepeatModeler using the genome as input, filter any repeats also annotated as protein sequences in the uniprot database and use this filtered library to mask the genome with RepeatMasker - **Map RNAseq data** Trim any remaining adapter sequences and map the trimmed reads to the input genome - **Run gene prediction software** Use the mapped RNAseq reads and the uniprot sequences to create hints for gene prediction using Braker3 on the masked genome - **Evaluate annotation** Run BUSCO to evaluate the completeness of the annotation produced ### Output data - FastQC reports for input RNAseq data before and after adapter trimming - RepeatMasker report containing quantity of masked sequence and distribution among TE families - 
Protein-coding gene annotation file in gff3 format - BUSCO summary of annotated sequences ## Setup Your data should be placed in the `data` folder, with the reference genome in the folder `data/ref` and the transcript data in the folder `data/rnaseq`. The config file requires the following to be given: ``` asm: 'absolute path to reference fasta' snakemake_dir_path: 'path to snakemake working directory' name: 'name for project, e.g. mHomSap1' RNA_dir: 'absolute path to rnaseq directory' busco_phylum: 'busco database to use for evaluation e.g. mammalia_odb10' ```" assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 description "# ERGA Protein-coding gene annotation workflow. Adapted from the work of Sagane Joye: https://github.com/sdind/genome_annotation_workflow ## Prerequisites The following programs are required to run the workflow and the listed versions were tested. It should be noted that older versions of snakemake are not compatible with newer versions of singularity as is noted here: [https://github.com/nextflow-io/nextflow/issues/1659](https://github.com/nextflow-io/nextflow/issues/1659). `conda v 23.7.3` `singularity v 3.7.3` `snakemake v 7.32.3` You will also need to acquire a licence key for Genemark and place this in your home directory with name `~/.gm_key` The key file can be obtained from the following location, where the licence should be read and agreed to: http://topaz.gatech.edu/GeneMark/license_download.cgi ## Workflow The pipeline is based on braker3 and was tested on the following dataset from Drosophila melanogaster: [https://doi.org/10.5281/zenodo.8013373](https://doi.org/10.5281/zenodo.8013373) ### Input data - Reference genome in fasta format - RNAseq data in paired-end zipped fastq format - uniprot fasta sequences in zipped fasta format ### Pipeline steps - **Repeat Model and Mask** Run RepeatModeler using the genome as input, filter any repeats also annotated as protein sequences in the uniprot database and use this filtered library to mask the genome with RepeatMasker - **Map RNAseq data** Trim any remaining adapter sequences and map the trimmed reads to the input genome - **Run gene prediction software** Use the mapped RNAseq reads and the uniprot sequences to create hints for gene prediction using Braker3 on the masked genome - **Evaluate annotation** Run BUSCO to evaluate the completeness of the annotation produced ### Output data - FastQC reports for input RNAseq data before and after adapter trimming - RepeatMasker report containing quantity of masked sequence and distribution among TE families - 
Protein-coding gene annotation file in gff3 format - BUSCO summary of annotated sequences ## Setup Your data should be placed in the `data` folder, with the reference genome in the folder `data/ref` and the transcript data in the folder `data/rnaseq`. The config file requires the following to be given: ``` asm: 'absolute path to reference fasta' snakemake_dir_path: 'path to snakemake working directory' name: 'name for project, e.g. mHomSap1' RNA_dir: 'absolute path to rnaseq directory' busco_phylum: 'busco database to use for evaluation e.g. mammalia_odb10' ```" assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df description "# ERGA Protein-coding gene annotation workflow. Adapted from the work of Sagane Joye: https://github.com/sdind/genome_annotation_workflow ## Prerequisites The following programs are required to run the workflow and the listed versions were tested. It should be noted that older versions of snakemake are not compatible with newer versions of singularity as is noted here: [https://github.com/nextflow-io/nextflow/issues/1659](https://github.com/nextflow-io/nextflow/issues/1659). `conda v 23.7.3` `singularity v 3.7.3` `snakemake v 7.32.3` You will also need to acquire a licence key for Genemark and place this in your home directory with name `~/.gm_key` The key file can be obtained from the following location, where the licence should be read and agreed to: http://topaz.gatech.edu/GeneMark/license_download.cgi ## Workflow The pipeline is based on braker3 and was tested on the following dataset from Drosophila melanogaster: [https://doi.org/10.5281/zenodo.8013373](https://doi.org/10.5281/zenodo.8013373) ### Input data - Reference genome in fasta format - RNAseq data in paired-end zipped fastq format - uniprot fasta sequences in zipped fasta format ### Pipeline steps - **Repeat Model and Mask** Run RepeatModeler using the genome as input, filter any repeats also annotated as protein sequences in the uniprot database and use this filtered library to mask the genome with RepeatMasker - **Map RNAseq data** Trim any remaining adapter sequences and map the trimmed reads to the input genome - **Run gene prediction software** Use the mapped RNAseq reads and the uniprot sequences to create hints for gene prediction using Braker3 on the masked genome - **Evaluate annotation** Run BUSCO to evaluate the completeness of the annotation produced ### Output data - FastQC reports for input RNAseq data before and after adapter trimming - RepeatMasker report containing quantity of masked sequence and distribution among TE families - 
Protein-coding gene annotation file in gff3 format - BUSCO summary of annotated sequences ## Setup Your data should be placed in the `data` folder, with the reference genome in the folder `data/ref` and the transcript data in the folder `data/rnaseq`. The config file requires the following to be given: ``` asm: 'absolute path to reference fasta' snakemake_dir_path: 'path to snakemake working directory' name: 'name for project, e.g. mHomSap1' RNA_dir: 'absolute path to rnaseq directory' busco_phylum: 'busco database to use for evaluation e.g. mammalia_odb10' ``` " assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 version "1" assertion.
- Workflow-RO-Crate version "0.2.0" assertion.
- 69f9d60c-925c-4b44-9b87-fc35c96eed2f contentSize "0" assertion.
- 70cec592-1fbe-4b53-a8ef-8dd3522246d6 contentSize "62" assertion.
- 64cb8be0-e907-41a0-8f6b-dbf642fe4372 contentSize "70" assertion.
- 21e37e3d-1372-4acd-9ce7-b1a005e9a41a contentSize "76" assertion.
- 0cdb95da-31b8-40ad-a2f9-32154e335db2 contentSize "83" assertion.
- 1dd2de01-f16f-4704-853a-116e2de3ff65 contentSize "84" assertion.
- 6aadc530-3b55-44e0-8c8d-8762952df493 contentSize "84" assertion.
- 5e018f3a-bc34-48db-b72e-f85214a13a9d contentSize "85" assertion.
- 6420e76a-1e0e-42a0-9b9e-e355de2a88a5 contentSize "86" assertion.
- d6d97bc9-1631-4347-920b-dec30f6aebea contentSize "327" assertion.
- b2864dbb-5ce6-4ad9-8dcb-27a0325baeb1 contentSize "2012" assertion.
- 89e1f03d-e2e9-46f1-8386-8cdeff35386c contentSize "2248" assertion.
- 7d430516-d8cb-4bc0-a4d9-e103a63e3478 contentSize "5859" assertion.
- 52497c90-bae5-4fbd-aca5-b5175ff7a4fa contentSize "7902" assertion.
- 043ddb69-1a1c-4eb4-8785-9e3803fa377c contentSize "13219" assertion.
- 2dd40f39-d265-4afe-ab50-5238e7bd6b16 contentSize "91586491" assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df contentSize "91622645" assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 contentSize "562" assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 dateCreated "2023-09-12 19:29:55+00:00" assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df dateCreated "2023-09-13 18:24:50.860910+00:00" assertion.
- 89e1f03d-e2e9-46f1-8386-8cdeff35386c dateCreated "2023-09-13 18:24:52.282962+00:00" assertion.
- d6d97bc9-1631-4347-920b-dec30f6aebea dateCreated "2023-09-13 18:24:52.285895+00:00" assertion.
- 69f9d60c-925c-4b44-9b87-fc35c96eed2f dateCreated "2023-09-13 18:24:52.287342+00:00" assertion.
- 64cb8be0-e907-41a0-8f6b-dbf642fe4372 dateCreated "2023-09-13 18:24:52.288353+00:00" assertion.
- 1dd2de01-f16f-4704-853a-116e2de3ff65 dateCreated "2023-09-13 18:24:52.289390+00:00" assertion.
- 70cec592-1fbe-4b53-a8ef-8dd3522246d6 dateCreated "2023-09-13 18:24:52.290394+00:00" assertion.
- 0cdb95da-31b8-40ad-a2f9-32154e335db2 dateCreated "2023-09-13 18:24:52.291405+00:00" assertion.
- 21e37e3d-1372-4acd-9ce7-b1a005e9a41a dateCreated "2023-09-13 18:24:52.292399+00:00" assertion.
- 6aadc530-3b55-44e0-8c8d-8762952df493 dateCreated "2023-09-13 18:24:52.293394+00:00" assertion.
- 5e018f3a-bc34-48db-b72e-f85214a13a9d dateCreated "2023-09-13 18:24:52.294349+00:00" assertion.
- 6420e76a-1e0e-42a0-9b9e-e355de2a88a5 dateCreated "2023-09-13 18:24:52.295311+00:00" assertion.
- 2dd40f39-d265-4afe-ab50-5238e7bd6b16 dateCreated "2023-09-13 18:24:52.663696+00:00" assertion.
- 52497c90-bae5-4fbd-aca5-b5175ff7a4fa dateCreated "2023-09-13 18:24:52.665086+00:00" assertion.