Most Popular
1500 questions
3 votes, 2 answers
Detect transcript isoform abundance for a specific gene in scRNA-seq
I want to count the isoform transcripts of a specific gene in scRNA-seq data. The data come from Mus musculus cells.
By transcript isoforms I mean the different alternatives provided by known annotation. For each one I need the total…
gc5
3 votes, 1 answer
Database of position weight matrices for protein motifs?
I am trying to identify proteins that carry a consensus target sequence for a kinase. Usually, if I were working with, e.g., transcription factors binding to DNA, I would use position weight matrices (PWMs) from one of a number of databases of PWMs…
Ian Sudbery
3 votes, 1 answer
Aligning ChIP-Seq reads to repeats for downstream peak analysis
This is a brief question regarding the above. I have previously used bowtie to map reads from paired-end ChIP-Seq sequencing, then used the positions for peak-calling. However, I'm trying to do something similar with a dataset which may consist of repeats (I…
user36196
3 votes, 2 answers
How to find/build the evolutionary history of a protein from its sequence?
I'd like to build the evolutionary history of a protein, given its sequence. Namely, given a FASTA entry how can I build an evolutionary tree? Here is the 5wxy protein as an…
0x90
3 votes, 1 answer
Removing repeated reads from nanopore 1D² reads
This is a summary of a discussion and a question that was posted by Devon O'Rourke on Twitter.
Following Albacore basecalls on a 1D² library I get two sets of .fq files, summary stats, etc.: one for the 1D basecalling script, and one for the 1D² script.…
gringer
3 votes, 1 answer
Predict the foldability of single-stranded DNA molecules
I have a list of regions of the human genome and I want to predict whether single-stranded molecules in a buffer would tend to fold and form hairpin structures through sequence self-complementarity. What's the most precise software/parameter set I can use to…
719016
3 votes, 2 answers
Is there a tool that can take a protein's amino acid sequence and display its locus on the genome?
I have the UniProt IDs, PDB IDs and FASTA files of several known proteins. I am looking for a tool that can take as input the protein's amino acid sequence and display the coding nucleotides of those amino acids. I am also looking for a tool that…
Adrian Smith
3 votes, 3 answers
How to demultiplex paired-end fastq reads with barcode 2 in the identifier line?
I have multiplexed paired-end fastq reads with dual barcodes. The issue is that one barcode is present in the header and one is present at the beginning of the read. I need a method to demultiplex this data, but in order to assign a read to an…
Caleb Benson
3 votes, 2 answers
Where can I find gene expression data on one of the cell lines in NCI-60?
I'm trying to retrieve the most highly expressed genes from a cell line in the NCI-60, let's say cell line NCI-H23, but I'm not sure where to find the gene expression data. My main goal is to create a network using that data and iRefIndex to be loaded…
user1171426
3 votes, 3 answers
Which reference to use for read mapping for popular model organisms
What is the "best" assembly for the popular model organisms:
human (GRCh37 and GRCh38 are obvious, I'd pick whatever bwakit uses)
mouse (GRCm37/GRCm38, OK)
but what about non-human/mouse ones?
fruit fly
zebrafish
E. coli
any other idea?
Manuel
3 votes, 2 answers
How to download the whole BLAST nt database into a specific folder?
I have successfully downloaded the whole nt BLAST database into the current folder using:
wget -b "ftp://ftp.ncbi.nlm.nih.gov/blast/db/nt.??.tar.gz"
However, I would like to download it in a specific folder, e.g. output/
I have tried:
wget -b…
Biomagician
3 votes, 0 answers
Can a second search with a limited search space improve results in MS proteomics data?
First, please know that I am a novice in this field and that it is thus very likely that you will regard this question as basic or flat-out stupid. Please be gentle and don't be afraid to criticize. I am asking this so I can learn. Also, don't feel…
Max Jonatan Karlsson
3 votes, 1 answer
Getting KeyError on certain residues in BioPython: ATOM vs. HETATM lines
I want to load the atoms for residue (' ', 85, ' ') in chain A of protein 3tmm with structure[0][chain_id][residue], but I get a KeyError exception:
Traceback (most recent call last):
[...]
File "/home/pok/Checkouts/bilkoviny/loader.py",…
jhutar
3 votes, 2 answers
Is there anything similar to GSEA for locus-based (instead of gene-based) data?
As the question states, I am interested in an analysis similar to Gene Set Enrichment Analysis (ranked gene sets) but focused on locus-level data instead of genes.
To explain in greater detail: I have a set of genomic coordinates from DNA…
Reilstein
3 votes, 1 answer
Normalization using parallel R script
I need to implement a parallel version of an R script because on one core it takes forever to execute. The function that takes the most time is scran::quickCluster() applied to an integer sparse matrix (UMI_count) that is 26000x66000. The code is…
Nikita Vlasenko