Questions tagged [python]

Python is a programming language commonly used for machine learning. Use this tag for any on-topic question that (a) involves Python either as a critical part of the question or expected answer, and (b) is not just about how to use Python.

Python (Wikipedia page) is a general purpose programming language designed for ease of use. It is a commonly used platform for machine learning. Two very popular threads concerned with using Python for statistics and machine learning are:

Be aware that Python-based questions are frequently migrated between Cross Validated (CV) and Stack Overflow (SO). CV fields questions with statistical / machine learning content, and SO fields questions of programming and implementation. Python questions can be on topic here when they are centrally about statistics / ML while involving Python either as a critical part of the question or expected answer. However, questions that are just about how to use Python / why it works a certain way, etc., are off topic here. Many such questions can be on topic on SO if they have a reproducible example.

We maintain a list of Python resources available on the internet in our Internet Support for Statistics Software meta.CV thread.

There is an extensive wiki for Python on SO here.

4791 questions

votes

2 answers

How to interpret p-value of Kolmogorov-Smirnov test (python)?

I have two samples and I want to test (using Python) whether they are drawn from the same distribution. To do that I use the statistical function ks_2samp from scipy.stats. It returns two values, and I have difficulty interpreting them. Help please!

python

asked May 02 '13 at 09:16

meri
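A minimal sketch of the two values `ks_2samp` returns, assuming SciPy and NumPy are installed. The samples, sample sizes, and seed are hypothetical and chosen only for illustration. The first value is the KS statistic (the largest vertical gap between the two empirical CDFs), and the second is the p-value for the null hypothesis that both samples come from the same distribution.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Hypothetical data: two samples from the same normal, plus one shifted sample.
same_a = rng.normal(loc=0.0, scale=1.0, size=500)
same_b = rng.normal(loc=0.0, scale=1.0, size=500)
shifted = rng.normal(loc=1.0, scale=1.0, size=500)

res_same = ks_2samp(same_a, same_b)
res_diff = ks_2samp(same_a, shifted)

# statistic: largest gap between the two empirical CDFs, between 0 and 1.
# pvalue: probability of a gap at least this large if both samples
#         really came from the same distribution.
print("same distribution:", res_same.statistic, res_same.pvalue)
print("shifted sample:   ", res_diff.statistic, res_diff.pvalue)
```

A small p-value is evidence against the samples sharing a distribution. A large p-value does not prove they share one; it only means the test found no evidence of a difference.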

votes

1 answer

Python library for returning MLE for Beta Geometric and Beta Discrete Weibull models

There is an R package called foretell that is useful for projecting customer retention based on Beta Geometric and Beta Discrete Weibull models. I am having trouble finding something similar for Python, at least one as streamlined. Does anyone know…

python

asked Jun 09 '21 at 19:34

Kbbm

votes

1 answer

Finding weight/value of each person on a team

If I have teams with between $n_1$ and $n_2$ people per team, along with the results of the teams' head-to-head matchups, how would I be able to estimate each person's value? Example data (I drew this up quickly, the actual one is many lines longer, with more…

python

asked May 22 '20 at 00:24

qag54938bcaoo

votes

1 answer

PyTorch Ignore padding for LSTM batch training

I realize there is pack_padded_sequence and so on for batch training LSTMs, but that takes an entire sequence, embeds it, and then forwards it through the LSTM. My LSTM is built so that it just takes an input character then forward just outputs the…

python

asked Feb 15 '20 at 08:01

user8714896

votes

1 answer

Cross Validation for regularized portfolio optimization

Hi, I have an explanation like the one below. I'm trying to find the minimum global portfolio, and according to the explanation I found, I need to use validation methods to choose the optimal parameters. I also need to use the regularizers. I'm OK with adding the…

python

asked Mar 04 '19 at 09:53

Hiru

votes

0 answers

Medical Imaging in Python (PyRadiomics): Concrete steps

I am new to medical imaging, but I am trying really hard to replicate some former analyses within this topic out of interest (e.g., https://www.ncbi.nlm.nih.gov/pubmed/26337765). My questions are: Are there any online resources that provide…

python

asked Jun 14 '18 at 08:38

Kim

vote

1 answer

Figuring out a good fit for this data

I am trying to find an appropriate mathematical model/equation for this data. Physically, it is essentially linear correlations of rainfall error (y-axis) with distance (x-axis). So for very short distances, the errors are highly correlated, but…

python

asked Jul 08 '22 at 04:54

AzureWinds

vote

1 answer

How to calculate the feature importance for multi-label classification problems

I am looking for some sources about "how to calculate the feature importance for multi-label classification problems". Could you give me some information, with related Python source code, on how to apply feature importance to multi-label datasets?

python

asked Jan 26 '22 at 18:46

Eleni Yenehun

vote

1 answer

How do I test how well my fit line predicts results?

I am a data science intern and I have been tasked with testing the time scalability of the schedule builder. Basically I have collected data and made a bunch of fit lines using the lmfit module. Now I need to run the schedule builder and see how…

python

asked Dec 07 '21 at 15:12

Evan Walker

vote

0 answers

How do I monitor the performance of ML models if the ground truth is delayed for 9 months?

We deployed a machine learning model to production around 1 year ago. I would like to somehow estimate the performance of my model (a binary classifier), but we only get the ground truth about 9 months after the prediction is made. I've noticed that the…

python

asked Oct 21 '21 at 15:43

Alexander De Leeuw

vote

0 answers

Can you have observations without choice in pylogit?

I have data in long format where, let's say, user 1 has 10 alternatives but did not choose any of them, so CHOICE is all 0. The problem I get is that when I include those users, all model parameters are set to 0. I do not understand why it is happening…

python

asked Nov 25 '20 at 09:01

user7021605

vote

0 answers

Getting "LinAlgError: SVD did not converge" error in sm.OLS.fit() in the first run only

I am getting a "LinAlgError: SVD did not converge" error in sm.OLS.fit() in the first run only. In the second run, the same code runs without any change in data or code. I have already tried the Stack Overflow solutions - Most likely there are NaNs in the data, you…

python

asked Aug 05 '20 at 17:56

Shailendra Kadre

vote

0 answers

Avoid Retraining a Model When Executing a Program?

I've started using OpenCV for some image processing projects and I'm wondering if there's a way to save time when it comes to processing test images against a database of faces. Issue: 10 pictures of each subject A, B, and C exist in folders on the…

python

asked Aug 14 '19 at 23:52

ev3670

vote

2 answers

How can I improve sentiment analysis of user comments?

I'm implementing sentiment analysis on a set of user comments. All comments are about the same object. At the moment I have decided to use three classes: negative, neutral and positive. I have a test array of 1500 comments with labeled classes. Tried to use…

python

asked Jun 28 '12 at 13:08

egens

vote

0 answers

How does bootstrapping work?

I'm trying to understand bootstrapping, so I watched the following video: https://www.youtube.com/watch?v=gcPIyeqymOU&t=338s. Starting from 2:53, the speaker explains that through bootstrapping we can get a closer inference on the population mean…

python

asked Jun 30 '17 at 20:03

bugsyb
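A minimal sketch of the idea behind bootstrapping a mean, using only the Python standard library. It shows a plain percentile bootstrap, which is one common variant and not necessarily the one the video describes. The data, the function name `bootstrap_mean_ci`, and its parameters are hypothetical, introduced only for illustration.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean (illustrative sketch)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(sample)
    # Resample WITH replacement, same size as the original sample,
    # and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of those means.
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical observed sample.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 3.1]
lower, upper = bootstrap_mean_ci(sample)
print("sample mean:", statistics.fmean(sample))
print("95% bootstrap interval for the mean:", (lower, upper))
```

The spread of the resampled means stands in for the sampling variability of the mean, which is what allows an interval to be formed from a single observed sample.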
