Highest Voted Questions - Bioinformatics Stack Exchange

3

votes

2 answers

Detect transcript isoform abundance for a specific gene in scRNA-seq

I want to detect the count of isoform transcripts for a specific gene in scRNA-seq data. The data comes from Mus musculus cells. By transcript isoforms I mean the different alternatives provided by known annotation. For each one I need the total…

asked Jan 28 '18 at 18:08

gc5

1,783
18
32

3

votes

1 answer

Database of position weight matrices for protein motifs?

I am trying to identify proteins that carry a consensus target sequence for a kinase. Usually, if I am working with, e.g., transcription factors binding to DNA, I would use position weight matrices (PWMs) from one of a number of databases of PWMs…

asked Jan 25 '18 at 13:23

Ian Sudbery

3,311
1
11
21

3

votes

1 answer

Aligning ChIP-Seq reads to repeats for downstream peak analysis

This is a brief question regarding the above. I have previously used bowtie to map reads from paired-end ChIP-Seq sequencing, then used the positions for peak-calling. However, I'm trying to do something similar with a dataset which may consist of repeats (I…

asked Jan 24 '18 at 15:09

user36196

291
1
6

3

votes

2 answers

How to find/build the evolutionary history of a protein from its sequence?

I'd like to build the evolutionary history of a protein, given its sequence. Namely, given a FASTA entry how can I build an evolutionary tree? Here is the 5wxy protein as an…

asked Jan 23 '18 at 15:32

0x90

1,437
9
18

3

votes

1 answer

Removing repeated reads from nanopore 1D² reads

This is a summary of discussions and a question that was posted by Devon O'Rourke on Twitter: Following Albacore basecalls on a 1D² library I get two sets of .fq files, summary stats, etc.: one for the 1D basecalling script, and one for the 1D² script.…

asked Jan 17 '18 at 23:04

gringer

14,012
5
23
79

3

votes

1 answer

Predict the foldability of single-stranded DNA molecules

I have a list of regions of the human genome and I want to predict if single-stranded molecules in a buffer would tend to fold and create pin structures by sequence self-complementarity. What's the most precise software/parameter set I can use to…

asked Jan 17 '18 at 10:16

719016

2,324
13
19

3

votes

2 answers

Is there a tool that can take a protein's amino acid sequence and display its locus on the genome?

I have the UniProt IDs, PDB IDs and FASTA files of several known proteins. I am looking for a tool that can take as input the protein's amino acid sequence and display the coding nucleotides of those amino acids. Moreover, I am looking for a tool that…

asked Jan 13 '18 at 18:06

Adrian Smith

357
1
7

3

votes

3 answers

How to demultiplex paired-end fastq reads with barcode 2 in the identifier line?

I have multiplexed paired-end fastq reads with dual barcodes. The issue is that one barcode is present in the header and one is present at the beginning of the read. I need a method to demultiplex this data, but in order to assign a read to an…

fastq

asked Jan 10 '18 at 20:29

Caleb Benson

31
2
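
A minimal sketch of the idea in the question above: build a demultiplexing key from both barcodes. It assumes the header barcode is the last colon-separated field of the identifier line (as in Illumina-style headers) and that the second barcode has a fixed length at the start of the read; the function name, the example header, and the barcode length are hypothetical.

```python
def demultiplex_key(header, sequence, inline_len):
    """Return ((header_barcode, inline_barcode), trimmed_sequence).

    Assumptions (not from the original question):
    - the header barcode is the text after the last ':' in the identifier line
    - the inline barcode is the first `inline_len` bases of the read
    """
    header_barcode = header.strip().rsplit(":", 1)[-1]
    inline_barcode = sequence[:inline_len]
    trimmed = sequence[inline_len:]  # read with the inline barcode removed
    return (header_barcode, inline_barcode), trimmed


# Hypothetical record: barcode ACGTAC in the header, TTAGGC at the read start.
key, trimmed = demultiplex_key(
    "@M01:1:FC:1:1101:100:200 1:N:0:ACGTAC", "TTAGGCACGTTT", 6
)
```

In a real pipeline the quality string would need the same trimming, and the key would be looked up in a sample sheet, possibly allowing mismatches.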

3

votes

2 answers

Where can I find gene expression data on one of the cell lines in NCI-60?

I'm trying to retrieve the most expressed genes from a cell line from the NCI-60, let's say cell line NCI-H23, but I'm not sure where to find the gene expression data. My main goal is to create a network using that data and iRefIndex to be loaded…

asked Jan 10 '18 at 19:35

user1171426

109
2

3

votes

3 answers

Which reference to use for read mapping for popular model organisms

What is the "best" assembly for the popular model organisms? Human (GRCh37 and GRCh38 are obvious, I'd pick whatever bwakit uses) and mouse (GRCm37/GRCm38, OK), but what about non-human/mouse ones: fruit fly, zebrafish, E. coli? Any other idea?

asked May 31 '17 at 23:36

Manuel

588
4
5

3

votes

2 answers

How to download the whole BLAST nt database into a specific folder?

I have successfully downloaded the whole nt BLAST database into the current folder using: wget -b "ftp://ftp.ncbi.nlm.nih.gov/blast/db/nt.??.tar.gz" However, I would like to download it into a specific folder, e.g. output/. I have tried: wget -b…

asked Jan 08 '18 at 11:53

Biomagician

2,459
16
30

3

votes

0 answers

Can second search with limited search space improve result in MS proteomics data?

First, please know that I am a novice in this field and that it is thus very likely that you will regard this question as basic or flat stupid. Please be gentle and don't be afraid to criticize. I am asking this so I can learn. Also, don't feel…

database

asked Jan 08 '18 at 10:30

Max Jonatan Karlsson

193
1
6

3

votes

1 answer

Getting KeyError on certain residues in BioPython: ATOM vs. HETATM lines

When I try to load the atoms for residue (' ', 85, ' ') in chain A of protein 3tmm with structure[0][chain_id][residue], I get a KeyError exception: Traceback (most recent call last): [...] File "/home/pok/Checkouts/bilkoviny/loader.py",…

asked Jan 04 '18 at 21:47

jhutar

165
3
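
A possible reading of the KeyError above, sketched under assumptions: Bio.PDB residue ids are tuples of (hetero flag, sequence number, insertion code), and a residue read from HETATM lines carries a hetero flag such as 'H_' plus the residue name instead of ' '. If residue 85 is such a residue, the key (' ', 85, ' ') will not exist. The helper below matches on the sequence number only; it works on any iterable of objects with an `.id` tuple, and the stand-in residues and the 'H_XXX' flag are hypothetical.

```python
from collections import namedtuple


def find_residues_by_number(chain, resseq):
    """Return all residues whose id has sequence number `resseq`,
    regardless of hetero flag or insertion code."""
    return [res for res in chain if res.id[1] == resseq]


# Stand-ins for Bio.PDB residues (hypothetical data, not taken from 3tmm).
FakeResidue = namedtuple("FakeResidue", ["resname", "id"])
chain = [
    FakeResidue("ALA", (" ", 84, " ")),      # ordinary ATOM residue
    FakeResidue("XXX", ("H_XXX", 85, " ")),  # residue from HETATM lines
]

matches = find_residues_by_number(chain, 85)
```

With a real structure, the same function could be called as find_residues_by_number(structure[0][chain_id], 85), since iterating over a Bio.PDB chain yields its residues.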

3

votes

2 answers

Is there anything similar to GSEA for locus-based (instead of gene-based) data?

As the question states, I am interested in an analysis similar to Gene Set Enrichment Analysis (ranked gene sets) but focused on locus-level data instead of genes. To explain in greater detail: I have a set of genomic coordinates from DNA…

asked Jan 03 '18 at 21:33

Reilstein

367
1
14

3

votes

1 answer

Normalization using parallel R script

I need to implement a parallel version of an R script because on one core it takes forever to execute. The function that takes the most time is scran::quickCluster() applied to an integer sparse matrix (UMI_count) that is 26000x66000. The code is…

asked Jan 03 '18 at 18:59

Nikita Vlasenko

2,558
3
26
38
