Most Popular

1500 questions
4
votes
1 answer

COG Annotation - Dealing with genes assigned to two or more COG categories

I am dealing with a list of genes which have been selected from a gene enrichment analysis. In order to see what kind of genes are overrepresented, I ran eggnog-mapper to do an orthology assignment of each gene against a bacterial database (provided…
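One possible way to handle genes carrying several COG letters is to split each gene's weight evenly across its categories, so that every gene still contributes a total of 1. The sketch below assumes the annotation has already been reduced to a gene-to-letters mapping (for example `"EG"` for a gene in both E and G); the function name and input layout are hypothetical, and fractional counting is only one convention among several.

```python
from collections import Counter


def count_cog_categories(gene_to_cog, fractional=True):
    """Tally COG category letters across genes.

    gene_to_cog: dict mapping gene ID -> string of category letters,
                 e.g. {"geneA": "E", "geneB": "EG"} (assumed input layout).
    fractional:  if True, a gene with n categories adds 1/n to each;
                 if False, it adds 1 to each (genes are counted repeatedly).
    """
    counts = Counter()
    for letters in gene_to_cog.values():
        # Keep only letters, so placeholders such as "-" are ignored.
        categories = sorted({ch for ch in letters if ch.isalpha()})
        if not categories:
            continue
        weight = 1 / len(categories) if fractional else 1
        for category in categories:
            counts[category] += weight
    return counts
```

With `fractional=False` the totals exceed the number of genes, which matters if the counts feed into a proportion or an enrichment test.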
4
votes
2 answers

Convert Reactome Protein IDs to Pathway IDs?

I have a list of about 50k 'Protein IDs' from Reactome. Is there a simple way to get all the corresponding 'Pathway IDs' for each protein? What is the best service to use? (I'm guessing I can use the Reactome API, but I don't necessarily want to hit…
Dan
  • 612
  • 3
  • 12
4
votes
5 answers

using python to write bioinformatics pipelines tutorial

I was wondering if there is a tutorial or a small code snippet that shows how to write a bioinformatics pipeline in Python: for example, run an aligner (say hisat), take its output, and process it with samtools. I was able to use subprocess from…
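A minimal sketch of the chaining the question describes, piping an aligner's output into `samtools sort` through `subprocess` without going through a shell. The file names, the index prefix, and both helper functions are hypothetical; the `hisat2 -x/-U` and `samtools sort -o` options are assumed to match the installed versions.

```python
import subprocess


def build_commands(index_prefix, reads_fastq, out_bam):
    """Return argument lists for the aligner and the sorter.

    All three arguments are placeholders for illustration.
    """
    # hisat2 writes SAM to stdout when no output file is given.
    align = ["hisat2", "-x", index_prefix, "-U", reads_fastq]
    # "-" tells samtools sort to read its input from stdin.
    sort = ["samtools", "sort", "-o", out_bam, "-"]
    return align, sort


def run_piped(first_cmd, second_cmd):
    """Run `first_cmd | second_cmd` and return both exit codes."""
    first = subprocess.Popen(first_cmd, stdout=subprocess.PIPE)
    second = subprocess.Popen(second_cmd, stdin=first.stdout)
    # Close our copy of the pipe so the first process sees a broken
    # pipe if the second one exits early.
    first.stdout.close()
    second_rc = second.wait()
    first_rc = first.wait()
    return first_rc, second_rc
```

Calling `run_piped(*build_commands("genome_index", "reads.fastq", "sorted.bam"))` would run the two tools back to back; checking both exit codes catches a failed aligner that a plain shell pipe would hide. For anything beyond a few steps, a workflow manager is usually easier to maintain than hand-written `subprocess` calls.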
3
votes
2 answers

Does this FASTQ data contain single or paired end calls?

I have this fastq data from GEO:

    zcat SRR1658526.fastq.gz | head -n 20
    @SRR1658526.1 HWI-ST398:296:C1MP4ACXX:1:1101:1093:2094…
0x90
  • 1,437
  • 9
  • 18
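One quick check is whether consecutive records in the file share a read name, which would suggest interleaved mates. The sketch below is a heuristic under that assumption only: the function names are hypothetical, and it cannot detect mates that were concatenated into a single record or split across separate files, so the run's metadata remains the more reliable source.

```python
import itertools


def read_names(fastq_lines):
    """Yield the read name of each 4-line FASTQ record, minus any /1 or /2."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 0:  # header line of a record
            name = line.strip().split()[0]
            if name.startswith("@"):
                name = name[1:]
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def looks_interleaved(fastq_lines, max_records=1000):
    """Return True if records pair up by name (record 1 with 2, 3 with 4, ...)."""
    names = list(itertools.islice(read_names(fastq_lines), max_records))
    if len(names) < 2 or len(names) % 2:
        return False
    return all(a == b for a, b in zip(names[0::2], names[1::2]))
```

The function accepts any iterable of lines, so an open `gzip.open(path, "rt")` handle can be passed directly.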
3
votes
2 answers

PDB file downloading: pymol automation vs. manual

I automated a PDB download using a Pymol script (below)

    python
    pdb_lists = ['3LZG', '6HJN']  # lots more pdbs
    for x in pdb_lists:
        cmd.fetch(x)
        cmd.select(x)
        cmd.save(x + '.pdb', x)
        cmd.delete(x)
    cmd.quit()
    python end

When the script…
M__
  • 12,263
  • 5
  • 28
  • 47
3
votes
1 answer

Adding an attribute to GFF3 file

I failed to add Note=Gene description to the mRNA attributes with the code below:

    #!/usr/bin/python3
    import click
    import gffutils
    import gffutils.gffwriter as gffwriter

    @click.command()
    @click.option('--gff3', help="Provide GFF3 file",…
user977828
  • 453
  • 3
  • 9
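For comparison, the same edit can be sketched without gffutils by rewriting the ninth (attributes) column of each matching line. This is a plain-text alternative rather than the asker's approach; the function names are hypothetical, and the sketch appends a new `Note` without merging it into one that already exists.

```python
# Characters that must be percent-encoded inside a GFF3 attribute value.
RESERVED = {"%": "%25", ";": "%3B", "=": "%3D", "&": "%26", ",": "%2C"}


def escape_value(text):
    """Percent-encode the characters GFF3 reserves in attribute values."""
    return "".join(RESERVED.get(ch, ch) for ch in text)


def add_note(gff_line, note, feature_type="mRNA"):
    """Append Note=<note> to the attributes of lines of the given type.

    Comment lines, malformed lines, and other feature types are
    returned unchanged.
    """
    if gff_line.startswith("#"):
        return gff_line
    newline = "\n" if gff_line.endswith("\n") else ""
    cols = gff_line.rstrip("\n").split("\t")
    if len(cols) != 9 or cols[2] != feature_type:
        return gff_line
    attrs = cols[8].rstrip(";")
    new_attr = "Note=" + escape_value(note)
    cols[8] = new_attr if attrs in ("", ".") else attrs + ";" + new_attr
    return "\t".join(cols) + newline
```

Applied line by line, for example `out.write(add_note(line, "Gene description"))`, it leaves gene and exon records untouched.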
3
votes
0 answers

Patient-sample mapping in GSE72056 dataset

I want to use the single-cell data from the following expression-profiling dataset, which covers RNA-seq of metastatic melanoma: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE72056 The data was also analyzed…
hadi
  • 31
  • 2
3
votes
2 answers

How to find common sequences among 6 multi-fasta files

I have 6 multi-fasta files, each of which contains ca. 1500 sequences like this: >Haladaptatus sp.…
Patrycja
  • 41
  • 3
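If "common" means records whose header line appears in every file, a set intersection over the parsed headers is enough. The sketch below assumes exactly that interpretation (matching on the full header text, not on the sequence itself); the function names are hypothetical.

```python
def parse_fasta(lines):
    """Parse FASTA lines into a dict of header (without '>') -> sequence."""
    chunks = {}
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            chunks[header] = []
        elif header is not None:
            chunks[header].append(line)
    return {name: "".join(parts) for name, parts in chunks.items()}


def common_headers(fasta_dicts):
    """Return the set of headers present in every parsed FASTA dict."""
    if not fasta_dicts:
        return set()
    common = set(fasta_dicts[0])
    for records in fasta_dicts[1:]:
        common &= set(records)
    return common
```

To match on sequence content instead, the same intersection can be taken over `set(records.values())`.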
3
votes
2 answers

How to transform and sort the matrix to make a heatmap showing signatures?

I have a data matrix with cells as rows and samples as columns. Here is the data as dput output: dput(data) structure(c(0, 0, 0.0095, 0.0091, 0, 0.0195, 0, 0.006, 0.0023, 0.0035, 0, 0.0306, 0, 0.0859, 0, 0, 0.0059, 0.0229, 0, 5e-04, 0, 0.0116,…
beginner
  • 631
  • 7
  • 15
3
votes
1 answer

Access workdir defined on command line from within Snakefile

Snakemake provides access to a workflow object within a Snakefile. This allows one to, for example, have dynamic programmatic access to the directory containing the Snakefile (via the workflow.basedir attribute). Is there a similar way to access the…
Daniel Standage
  • 5,080
  • 15
  • 50
3
votes
1 answer

Principle of TMT Tags in Multiplex Proteomics

I am new to proteomics research and analyzing mass-spec data. I am trying to learn more about the theory/principle of TMT tagging (a type of isobaric labeling) coupled with LC-MS/MS spectrometry. I understand that each tag has a 1) protein binding…
3
votes
2 answers

How to demultiplex a mix of single-indexed and dual-indexed samples

The problem: If I have a sample sheet that contains both single-indexed and dual-indexed samples, I can split it up into two sample sheets and then run bcl2fastq on each one. However, when doing this, large Undetermined fastq files are generated.…
3
votes
0 answers

How do you represent small molecules in SBOL?

In SBOL, how do you represent small molecules like arabinose or chloramphenicol? There's a type for saying a component is a small molecule, but they don't really have sequences like DNA or proteins do, so what's the recommended way to provide…
jakebeal
  • 653
  • 4
  • 16
3
votes
1 answer

Does SBOL require representing intermediate products like mRNA?

Do I need to include intermediate gene products like mRNA in order to express a gene regulatory relationship in SBOL? For example, let's say that I'm representing transcriptional regulation of the pTet promoter by means of a coding sequence that…
jakebeal
  • 653
  • 4
  • 16
3
votes
1 answer

Paired end sequencing with R1 and R2 of different length. Possible?

Are there commercial sequencing kits/sequencers that allow paired-end sequencing in which the two reads obtained for each fragment are of different lengths, for example 150 bp for R1 and 50 bp for R2? Is it at least possible in principle? There are…
mox
  • 333
  • 2
  • 8