Matches in Nanopublications for { <https://w3id.org/ro-id/3fdc0374-95f4-4c7d-928c-24dd80fbd26f/> ?p ?o ?g. }
Showing items 1 to 78 of 78 with 100 items per page.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f type ResearchObject assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f type LiveRO assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f type Dataset assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f mainEntity "workflow/Snakefile" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f importedBy 0000-0003-2388-0744 assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f creativeWorkStatus "Work-in-progress" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f description 3fdc0374-95f4-4c7d-928c-24dd80fbd26f assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f description "# prepareChIPs This is a simple `snakemake` workflow template for preparing **single-end** ChIP-Seq data. The steps implemented are: 1. Download raw fastq files from SRA 2. Trim and Filter raw fastq files using `AdapterRemoval` 3. Align to the supplied genome using `bowtie2` 4. Deduplicate Alignments using `Picard MarkDuplicates` 5. Call Macs2 Peaks using `macs2` A pdf of the rulegraph is available [here](workflow/rules/rulegraph.pdf). Full details for each step are given below. Any additional parameters for tools can be specified using `config/config.yml`, along with many of the requisite paths. To run the workflow with default settings, simply run as follows (after editing `config/samples.tsv`) ```bash snakemake --use-conda --cores 16 ``` If running on an HPC cluster, a snakemake profile will be required for submission to the queueing system and appropriate resource allocation. Please discuss this with your HPC support team. Nodes may also have restricted internet access and rules which download files may not work on many HPCs. Please see below or discuss this with your support team. Whilst no snakemake wrappers are explicitly used in this workflow, the underlying scripts are utilised where possible to minimise any issues with HPC clusters with restrictions on internet access. These scripts are based on `v1.31.1` of the snakemake wrappers. ### Important Note Regarding OSX Systems It should be noted that this workflow is **currently incompatible with OSX-based systems**. There are two unsolved issues: 1. `fasterq-dump` has a bug which is specific to conda environments. This has been updated in v3.0.3 but this patch has not yet been made available to conda environments for OSX. Please check [here](https://anaconda.org/bioconda/sra-tools) to see if this has been updated. 2.
The following error appears in some OSX-based R sessions, in a system-dependent manner: ``` Error in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : polygon edge not found ``` The fix for this bug is currently unknown ## Download Raw Data ### Outline The file `samples.tsv` is used to specify all steps for this workflow. This file must contain the columns: `accession`, `target`, `treatment` and `input` 1. `accession` must be an SRA accession. Only single-end data is currently supported by this workflow 2. `target` defines the ChIP target. All files common to a target and treatment will be used to generate summarised coverage in bigWig Files 3. `treatment` defines the treatment group each file belongs to. If only one treatment exists, simply use the value 'control' or similar for every file 4. `input` should contain the accession for the relevant input sample. These will only be downloaded once. Valid input samples are *required* for this workflow As some HPCs restrict internet access for submitted jobs, *it may be prudent to run the initial rules in an interactive session* if at all possible. This can be performed using the following (with 2 cores provided as an example) ```bash snakemake --use-conda --until get_fastq --cores 2 ``` ### Outputs - Downloaded files will be gzipped and written to `data/fastq/raw`. - `FastQC` and `MultiQC` will also be run, with output in `docs/qc/raw` Both of these directories are able to be specified as relative paths in `config.yml` ## Read Filtering ### Outline Read trimming is performed using [AdapterRemoval](https://adapterremoval.readthedocs.io/en/stable/). Default settings are customisable using config.yml, with the defaults set to discard reads shorter than 50nt, and to trim using quality scores with a threshold of Q30. 
### Outputs - Trimmed fastq.gz files will be written to `data/fastq/trimmed` - `FastQC` and `MultiQC` will also be run, with output in `docs/qc/trimmed` - AdapterRemoval 'settings' files will be written to `output/adapterremoval` ## Alignments ### Outline Alignment is performed using [`bowtie2`](https://bowtie-bio.sourceforge.net/bowtie2/manual.shtml) and it is assumed that this index is available before running this workflow. The path and prefix must be provided using config.yml. This index will also be used to produce the file `chrom.sizes` which is essential for conversion of bedGraph files to the more efficient bigWig files. ### Outputs - Alignments will be written to `data/aligned` - `bowtie2` log files will be written to `output/bowtie2` (not the conventional log directory) - The file `chrom.sizes` will be written to `output/annotations` Both sorted and the original unsorted alignments will be returned. However, the unsorted alignments are marked with `temp()` and can be deleted using ```bash snakemake --delete-temp-output --cores 1 ``` ## Deduplication ### Outline Deduplication is performed using [MarkDuplicates](https://gatk.broadinstitute.org/hc/en-us/articles/360037052812-MarkDuplicates-Picard-) from the Picard set of tools. By default, deduplication will remove the duplicates from the set of alignments. All resultant bam files will be sorted and indexed. ### Outputs - Deduplicated alignments are written to `data/deduplicated` and are indexed - DuplicationMetrics files are written to `output/markDuplicates` ## Peak Calling ### Outline This is performed using [`macs2 callpeak`](https://pypi.org/project/MACS2/). - Peak calling will be performed on: a. each sample individually, and b. merged samples for those sharing a common ChIP target and treatment group. - Coverage bigWig files for each individual sample are produced using CPM values (i.e.
Signal Per Million Reads, SPMR) - For all combinations of target and treatment coverage bigWig files are also produced, along with fold-enrichment bigWig files ### Outputs - Individual outputs are written to `output/macs2/{accession}` + Peaks are written in `narrowPeak` format along with `summits.bed` + bedGraph files are automatically converted to bigWig files, and the originals are marked with `temp()` for subsequent deletion + callpeak log files are also added to this directory - Merged outputs are written to `output/macs2/{target}/` + bedGraph Files are also converted to bigWig and marked with `temp()` + Fold-Enrichment bigWig files are also created with the original bedGraph files marked with `temp()` " assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f contentSize "75099" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f dateCreated "2023-09-08 11:58:23.205668+00:00" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Domain "https://w3id.org/ro-id/c55159f8-5154-4d92-b6c8-ebd2fcfc4bf0" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Domain "https://w3id.org/ro-id/af02653a-41cb-4cbc-86f3-4da526e29d72" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Domain "https://w3id.org/ro-id/a1bb094f-e223-4eb7-90c8-2d55ec57e261" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Lemma "https://w3id.org/ro-id/1efe80d0-1f70-4a71-8b23-3e1de8d01a44" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Lemma "https://w3id.org/ro-id/3d3af63c-e9de-41b9-8795-881508422b9f" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Lemma "https://w3id.org/ro-id/6b05e40d-078b-4023-b984-3dd996317ee9" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Lemma "https://w3id.org/ro-id/d98db78b-77a7-454e-af9d-4ef2514949dc" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Lemma "https://w3id.org/ro-id/9156e762-e630-40b0-b9aa-c059ba0a40c1" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Lemma "https://w3id.org/ro-id/5bf0c07d-8b93-4a78-b020-101fa59fd22a" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Lemma "https://w3id.org/ro-id/dba77b48-6599-4bbd-82d2-f9d4cfd9abd8" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f cite-as "Stevie Pederson. "Research Object Crate for prepareChIPs:." ROHub. Sep 08, 2023. https://w3id.org/ro-id/3fdc0374-95f4-4c7d-928c-24dd80fbd26f." assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Concept "https://w3id.org/ro-id/b1a61304-f050-4541-9d32-259ff31f5fdc" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Concept "https://w3id.org/ro-id/d35c78c4-748a-4706-a8b1-cee0aa47f9ab" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Concept "https://w3id.org/ro-id/d80380ce-e484-491a-b68b-e978fd2930c2" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Concept "https://w3id.org/ro-id/ff38c043-b00d-41e1-b3e0-4f9aa80de812" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Concept "https://w3id.org/ro-id/3c10da67-cce5-459d-88dd-2bce722516d0" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Concept "https://w3id.org/ro-id/6277e661-77ff-4bd7-b1b3-f2f51c0ab0ec" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Concept "https://w3id.org/ro-id/dfdcee9f-16ae-4a8f-bae0-4b0b69a73d53" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Concept "https://w3id.org/ro-id/f7b300ce-ad48-4976-9503-40223a6a9144" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Concept "https://w3id.org/ro-id/8e9e8062-ddbd-46c3-a771-5be65444212f" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f FieldOfResearch "https://w3id.org/ro-id/64f93c98-c150-4431-885a-8ac5fb5add82" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f FieldOfResearch "https://w3id.org/ro-id/28fa92df-fcc2-479d-b1db-71bdf50be326" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f FieldOfResearch "https://w3id.org/ro-id/46d241be-9337-4749-9543-e01808c828ce" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f FieldOfResearch "https://w3id.org/ro-id/0f19c8c2-dbdb-4207-b24c-c97de87452ac" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f IPTC "https://w3id.org/ro-id/708386a0-6582-4493-a7a2-7e22bc2c0dbb" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f IPTC "https://w3id.org/ro-id/61a7eb88-5e9f-4cfc-b578-c406b7a43144" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f IPTC "https://w3id.org/ro-id/e08487bf-fd3c-4882-90b4-5f9ce844a146" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f IPTC "https://w3id.org/ro-id/c33311df-7f79-464a-98d9-4ca351c53eb6" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f IPTC "https://w3id.org/ro-id/a6991c9d-c113-435e-9455-dfbbd1f0ed19" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f IPTC "https://w3id.org/ro-id/2abd0462-b126-4e65-beed-dd55db1ab4b4" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f IPTC "https://w3id.org/ro-id/339f5ed4-2f45-49a7-926d-a2b248657c9b" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f IPTC "https://w3id.org/ro-id/b827efbe-6c40-47b1-bba2-b9623dc2ed2f" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f NASA "https://w3id.org/ro-id/49f24138-b352-4b9d-bbcd-b10c266846ca" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f NASA "https://w3id.org/ro-id/c9a9f204-3830-48be-a7a8-9f4e19d0fd00" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f NASA "https://w3id.org/ro-id/4366f40c-5956-4a88-87bf-794b7c12d33b" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f NASA "https://w3id.org/ro-id/c65c143c-5478-46a4-a257-4ec365270d90" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Phrase "https://w3id.org/ro-id/ccf5e4c4-0c4e-48dd-8bdd-84006c60ec4b" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Phrase "https://w3id.org/ro-id/b673a29e-23c3-487e-ad95-8fc69f7d7547" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Phrase "https://w3id.org/ro-id/faf7e5b3-b0ef-4e09-b55c-a745db352edf" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Phrase "https://w3id.org/ro-id/e193bcdc-d8c4-4b4e-8331-c1d46c14c206" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Phrase "https://w3id.org/ro-id/52e0bc84-491c-4913-9a92-5fe4af038b25" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Phrase "https://w3id.org/ro-id/113a166f-2c22-4c99-a811-eb9943175d7b" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Phrase "https://w3id.org/ro-id/a49165ba-fbd5-4483-9c39-9cb8258e6c5a" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Phrase "https://w3id.org/ro-id/69951488-193c-43c5-ae9a-c5135803cb3e" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Sentence "https://w3id.org/ro-id/13a2600e-aeba-4334-9fd1-abb8f2542e75" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Sentence "https://w3id.org/ro-id/0d2c2af7-0ab6-4218-885f-a838c95f4871" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Sentence "https://w3id.org/ro-id/5cf3d332-4cd5-432a-8423-200792b8d509" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Sentence "https://w3id.org/ro-id/78fbbea2-7dd0-4f4b-984d-625c6013c161" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f Sentence "https://w3id.org/ro-id/0fc5108a-3620-4c1b-94d3-a22e717bd424" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f url "https://workflowhub.eu/workflows/528/ro_crate?version=1" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f name "Research Object Crate for prepareChIPs:" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f contentUrl "https://api.rohub.org/api/ros/3fdc0374-95f4-4c7d-928c-24dd80fbd26f/crate/download/" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f creator 0000-0003-2388-0744 assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f dateModified "2024-03-05 12:23:16.330026+00:00" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f datePublished "2023-09-08 11:58:23.205668+00:00" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f encodingFormat "application/ld+json" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f hasPart 2a0f5a70-7132-4851-8b07-d08c25f21ffe assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f hasPart 6816861d-1552-4acb-bce5-5a7392b92880 assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f hasPart 9511483c-1216-4050-acdc-a6b331ce81c7 assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f hasPart c16b77ee-3113-4dca-ab0a-e871e51008c0 assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f hasPart 03bab0c3-27a1-4bad-9192-54fd5718efb6 assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f hasPart 165ceeeb-0982-40e6-a0ad-a77d0eec7316 assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f hasPart 17e5cc71-3d5c-46fd-8a27-709516f324bd assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f hasPart 44eaf315-663c-450e-b246-b417fb1251ba assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f identifier "https://w3id.org/ro-id/3fdc0374-95f4-4c7d-928c-24dd80fbd26f" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f license no-permission assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f isBasedOn "https://github.com/smped/prepareChIPs.git" assertion.
- 3fdc0374-95f4-4c7d-928c-24dd80fbd26f author 0000-0001-8197-3303 assertion.