Matches in Nanopublications for { ?s ?p ?o <https://w3id.org/np/RAn2jx_I2zMayleWgTUbhuito99u7BoKWJskrqnJ7dVOk/assertion>. }
- 0000-0003-2388-0744 type Agent assertion.
- Workflow-RO-Crate type CreativeWork assertion.
- about.workflowhub.eu type Organization assertion.
- bts480 type ComputerLanguage assertion.
- enrichment_service-account-enrichment type Agent assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df type ResearchObject assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df type LiveRO assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df type Dataset assertion.
- 17b98cad-68c7-4991-acb1-b922f0c3d44f type Folder assertion.
- 17b98cad-68c7-4991-acb1-b922f0c3d44f type Dataset assertion.
- 28d4cc6a-cc26-4629-865c-951dcec04a63 type Folder assertion.
- 28d4cc6a-cc26-4629-865c-951dcec04a63 type Dataset assertion.
- 72f47650-7679-4648-bb43-d28fa77665c4 type Folder assertion.
- 72f47650-7679-4648-bb43-d28fa77665c4 type Dataset assertion.
- ef0d1b26-8b1c-4a45-9758-52610cb8e3a8 type Folder assertion.
- ef0d1b26-8b1c-4a45-9758-52610cb8e3a8 type Dataset assertion.
- 043ddb69-1a1c-4eb4-8785-9e3803fa377c type Resource assertion.
- 043ddb69-1a1c-4eb4-8785-9e3803fa377c type CreativeWork assertion.
- 043ddb69-1a1c-4eb4-8785-9e3803fa377c type MediaObject assertion.
- 0cdb95da-31b8-40ad-a2f9-32154e335db2 type Resource assertion.
- 0cdb95da-31b8-40ad-a2f9-32154e335db2 type MediaObject assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 type ComputationalWorkflow assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 type SoftwareSourceCode assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 type Resource assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 type MediaObject assertion.
- 1dd2de01-f16f-4704-853a-116e2de3ff65 type Resource assertion.
- 1dd2de01-f16f-4704-853a-116e2de3ff65 type MediaObject assertion.
- 21e37e3d-1372-4acd-9ce7-b1a005e9a41a type Resource assertion.
- 21e37e3d-1372-4acd-9ce7-b1a005e9a41a type MediaObject assertion.
- 2dd40f39-d265-4afe-ab50-5238e7bd6b16 type Resource assertion.
- 2dd40f39-d265-4afe-ab50-5238e7bd6b16 type MediaObject assertion.
- 52497c90-bae5-4fbd-aca5-b5175ff7a4fa type Resource assertion.
- 52497c90-bae5-4fbd-aca5-b5175ff7a4fa type MediaObject assertion.
- 5e018f3a-bc34-48db-b72e-f85214a13a9d type Resource assertion.
- 5e018f3a-bc34-48db-b72e-f85214a13a9d type MediaObject assertion.
- 6420e76a-1e0e-42a0-9b9e-e355de2a88a5 type Resource assertion.
- 6420e76a-1e0e-42a0-9b9e-e355de2a88a5 type MediaObject assertion.
- 64cb8be0-e907-41a0-8f6b-dbf642fe4372 type Resource assertion.
- 64cb8be0-e907-41a0-8f6b-dbf642fe4372 type MediaObject assertion.
- 69f9d60c-925c-4b44-9b87-fc35c96eed2f type Resource assertion.
- 69f9d60c-925c-4b44-9b87-fc35c96eed2f type MediaObject assertion.
- 6aadc530-3b55-44e0-8c8d-8762952df493 type Resource assertion.
- 6aadc530-3b55-44e0-8c8d-8762952df493 type MediaObject assertion.
- 70cec592-1fbe-4b53-a8ef-8dd3522246d6 type Resource assertion.
- 70cec592-1fbe-4b53-a8ef-8dd3522246d6 type MediaObject assertion.
- 7d430516-d8cb-4bc0-a4d9-e103a63e3478 type Resource assertion.
- 7d430516-d8cb-4bc0-a4d9-e103a63e3478 type MediaObject assertion.
- 89e1f03d-e2e9-46f1-8386-8cdeff35386c type Resource assertion.
- 89e1f03d-e2e9-46f1-8386-8cdeff35386c type MediaObject assertion.
- b2864dbb-5ce6-4ad9-8dcb-27a0325baeb1 type Resource assertion.
- b2864dbb-5ce6-4ad9-8dcb-27a0325baeb1 type MediaObject assertion.
- d6d97bc9-1631-4347-920b-dec30f6aebea type Resource assertion.
- d6d97bc9-1631-4347-920b-dec30f6aebea type MediaObject assertion.
- ro-crate-metadata.json type CreativeWork assertion.
- 0000-0003-4771-6113 type Person assertion.
- 554 type Person assertion.
- 191 type Organization assertion.
- 191 type Project assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df mainEntity "Snakefile" assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df importedBy 0000-0003-2388-0744 assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 sdPublisher "https://about.workflowhub.eu/" assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 producer "https://workflowhub.eu/projects/191" assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df description "# ERGA Protein-coding gene annotation workflow. Adapted from the work of Sagane Joye: https://github.com/sdind/genome_annotation_workflow ## Prerequisites The following programs are required to run the workflow and the listed versions were tested. It should be noted that older versions of snakemake are not compatible with newer versions of singularity as is noted here: [https://github.com/nextflow-io/nextflow/issues/1659](https://github.com/nextflow-io/nextflow/issues/1659). `conda v 23.7.3` `singularity v 3.7.3` `snakemake v 7.32.3` You will also need to acquire a licence key for Genemark and place this in your home directory with name `~/.gm_key` The key file can be obtained from the following location, where the licence should be read and agreed to: http://topaz.gatech.edu/GeneMark/license_download.cgi ## Workflow The pipeline is based on braker3 and was tested on the following dataset from Drosophila melanogaster: [https://doi.org/10.5281/zenodo.8013373](https://doi.org/10.5281/zenodo.8013373) ### Input data - Reference genome in fasta format - RNAseq data in paired-end zipped fastq format - uniprot fasta sequences in zipped fasta format ### Pipeline steps - **Repeat Model and Mask** Run RepeatModeler using the genome as input, filter any repeats also annotated as protein sequences in the uniprot database and use this filtered library to mask the genome with RepeatMasker - **Map RNAseq data** Trim any remaining adapter sequences and map the trimmed reads to the input genome - **Run gene prediction software** Use the mapped RNAseq reads and the uniprot sequences to create hints for gene prediction using Braker3 on the masked genome - **Evaluate annotation** Run BUSCO to evaluate the completeness of the annotation produced ### Output data - FastQC reports for input RNAseq data before and after adapter trimming - RepeatMasker report containing quantity of masked sequence and distribution among TE families - Protein-coding gene annotation file in gff3 format - BUSCO summary of annotated sequences ## Setup Your data should be placed in the `data` folder, with the reference genome in the folder `data/ref` and the transcript data in the folder `data/rnaseq`. The config file requires the following to be given: ``` asm: 'absolute path to reference fasta' snakemake_dir_path: 'path to snakemake working directory' name: 'name for project, e.g. mHomSap1' RNA_dir: 'absolute path to rnaseq directory' busco_phylum: 'busco database to use for evaluation e.g. mammalia_odb10' ```" assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 description "# ERGA Protein-coding gene annotation workflow. Adapted from the work of Sagane Joye: https://github.com/sdind/genome_annotation_workflow ## Prerequisites The following programs are required to run the workflow and the listed versions were tested. It should be noted that older versions of snakemake are not compatible with newer versions of singularity as is noted here: [https://github.com/nextflow-io/nextflow/issues/1659](https://github.com/nextflow-io/nextflow/issues/1659). `conda v 23.7.3` `singularity v 3.7.3` `snakemake v 7.32.3` You will also need to acquire a licence key for Genemark and place this in your home directory with name `~/.gm_key` The key file can be obtained from the following location, where the licence should be read and agreed to: http://topaz.gatech.edu/GeneMark/license_download.cgi ## Workflow The pipeline is based on braker3 and was tested on the following dataset from Drosophila melanogaster: [https://doi.org/10.5281/zenodo.8013373](https://doi.org/10.5281/zenodo.8013373) ### Input data - Reference genome in fasta format - RNAseq data in paired-end zipped fastq format - uniprot fasta sequences in zipped fasta format ### Pipeline steps - **Repeat Model and Mask** Run RepeatModeler using the genome as input, filter any repeats also annotated as protein sequences in the uniprot database and use this filtered library to mask the genome with RepeatMasker - **Map RNAseq data** Trim any remaining adapter sequences and map the trimmed reads to the input genome - **Run gene prediction software** Use the mapped RNAseq reads and the uniprot sequences to create hints for gene prediction using Braker3 on the masked genome - **Evaluate annotation** Run BUSCO to evaluate the completeness of the annotation produced ### Output data - FastQC reports for input RNAseq data before and after adapter trimming - RepeatMasker report containing quantity of masked sequence and distribution among TE families - Protein-coding gene annotation file in gff3 format - BUSCO summary of annotated sequences ## Setup Your data should be placed in the `data` folder, with the reference genome in the folder `data/ref` and the transcript data in the folder `data/rnaseq`. The config file requires the following to be given: ``` asm: 'absolute path to reference fasta' snakemake_dir_path: 'path to snakemake working directory' name: 'name for project, e.g. mHomSap1' RNA_dir: 'absolute path to rnaseq directory' busco_phylum: 'busco database to use for evaluation e.g. mammalia_odb10' ```" assertion.
- Workflow-RO-Crate version "0.2.0" assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 version "1" assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df contentSize "91622645" assertion.
- 043ddb69-1a1c-4eb4-8785-9e3803fa377c contentSize "13219" assertion.
- 0cdb95da-31b8-40ad-a2f9-32154e335db2 contentSize "83" assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 contentSize "562" assertion.
- 1dd2de01-f16f-4704-853a-116e2de3ff65 contentSize "84" assertion.
- 21e37e3d-1372-4acd-9ce7-b1a005e9a41a contentSize "76" assertion.
- 2dd40f39-d265-4afe-ab50-5238e7bd6b16 contentSize "91586491" assertion.
- 52497c90-bae5-4fbd-aca5-b5175ff7a4fa contentSize "7902" assertion.
- 5e018f3a-bc34-48db-b72e-f85214a13a9d contentSize "85" assertion.
- 6420e76a-1e0e-42a0-9b9e-e355de2a88a5 contentSize "86" assertion.
- 64cb8be0-e907-41a0-8f6b-dbf642fe4372 contentSize "70" assertion.
- 69f9d60c-925c-4b44-9b87-fc35c96eed2f contentSize "0" assertion.
- 6aadc530-3b55-44e0-8c8d-8762952df493 contentSize "84" assertion.
- 70cec592-1fbe-4b53-a8ef-8dd3522246d6 contentSize "62" assertion.
- 7d430516-d8cb-4bc0-a4d9-e103a63e3478 contentSize "5859" assertion.
- 89e1f03d-e2e9-46f1-8386-8cdeff35386c contentSize "2248" assertion.
- b2864dbb-5ce6-4ad9-8dcb-27a0325baeb1 contentSize "2012" assertion.
- d6d97bc9-1631-4347-920b-dec30f6aebea contentSize "327" assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df dateCreated "2023-09-13 18:24:50.860910+00:00" assertion.
- 043ddb69-1a1c-4eb4-8785-9e3803fa377c dateCreated "2023-09-13 18:24:52.667262+00:00" assertion.
- 0cdb95da-31b8-40ad-a2f9-32154e335db2 dateCreated "2023-09-13 18:24:52.291405+00:00" assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 dateCreated "2023-09-12 19:29:55+00:00" assertion.
- 1dd2de01-f16f-4704-853a-116e2de3ff65 dateCreated "2023-09-13 18:24:52.289390+00:00" assertion.
- 21e37e3d-1372-4acd-9ce7-b1a005e9a41a dateCreated "2023-09-13 18:24:52.292399+00:00" assertion.
- 2dd40f39-d265-4afe-ab50-5238e7bd6b16 dateCreated "2023-09-13 18:24:52.663696+00:00" assertion.
- 52497c90-bae5-4fbd-aca5-b5175ff7a4fa dateCreated "2023-09-13 18:24:52.665086+00:00" assertion.
- 5e018f3a-bc34-48db-b72e-f85214a13a9d dateCreated "2023-09-13 18:24:52.294349+00:00" assertion.
- 6420e76a-1e0e-42a0-9b9e-e355de2a88a5 dateCreated "2023-09-13 18:24:52.295311+00:00" assertion.
- 64cb8be0-e907-41a0-8f6b-dbf642fe4372 dateCreated "2023-09-13 18:24:52.288353+00:00" assertion.
- 69f9d60c-925c-4b44-9b87-fc35c96eed2f dateCreated "2023-09-13 18:24:52.287342+00:00" assertion.
- 6aadc530-3b55-44e0-8c8d-8762952df493 dateCreated "2023-09-13 18:24:52.293394+00:00" assertion.
- 70cec592-1fbe-4b53-a8ef-8dd3522246d6 dateCreated "2023-09-13 18:24:52.290394+00:00" assertion.
- 7d430516-d8cb-4bc0-a4d9-e103a63e3478 dateCreated "2023-09-13 18:24:52.665788+00:00" assertion.