
Essentially, I am trying to read some text and count instances of letters. Very simple. However, no matter what I try, I get different results for "E" and "e", when I want combined results. Here is what I have:

import nltk
import re

f = open('mytext.txt')
raw = f.read()
#print raw
#print len(raw) #7234

raw.lower()

prompt = raw_input("Enter stuff: ") 

potato = re.compile(r'[a-z]*', re.IGNORECASE)
potato = re.match(r'[a-z]*', prompt, re.IGNORECASE)
if potato: 
   print raw.count(prompt)
else:
   print "try again"

#document control f "e" = 808
#print result "e" = 802, "E" = 6
jonrsharpe
SnarkShark
  • Try `raw = raw.lower()` rather than just `raw.lower()` – Kamehameha Jul 15 '15 at 12:58
  • Not sure if this matters, but if you are trying to convert raw to all lowercase, you have to save raw.lower() as a variable i.e. `raw_2 = raw.lower()` and then continue the rest of your code using raw_2 instead of raw – Sam cd Jul 15 '15 at 13:01
  • or do what @Kamehameha said – Sam cd Jul 15 '15 at 13:02

2 Answers


Calling raw.lower() on its own has no effect on raw: it returns a new lowercase string, and you didn't store the result. Try this instead:

raw = raw.lower()
Luna

This:

raw.lower()

does not do what you think it does. Replace it with

raw = raw.lower()

Python strings are immutable, so methods such as lower() never change a string in place; they return a new string with the modification applied, and you have to assign that result to a name to keep it.
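To see the difference, here is a minimal sketch using a short made-up sample string in place of the contents of mytext.txt (the sample text and the name `lowered` are assumptions for illustration only). It should run the same way under Python 2 and Python 3:

```python
# Hypothetical sample text standing in for the file contents.
raw = "Eerie Evening"

raw.lower()                   # builds a new string, which is thrown away
still_mixed = raw             # raw is unchanged: "Eerie Evening"

lowered = raw.lower()         # keep the result by assigning it
e_before = raw.count("e")     # 3: only the lowercase "e" characters
e_after = lowered.count("e")  # 5: "E" and "e" are now counted together
```

The gap between `e_before` and `e_after` is the same kind of split as the 802 versus 808 in the question.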

If you want it to be case-insensitive for both the input text and the prompt entered by the user, also change

prompt = raw_input("Enter stuff: ") 

to

prompt = raw_input("Enter stuff: ") 
prompt = prompt.lower()
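
Putting both changes together, one way to sketch the combined count is a small helper that lowercases the text and the search string before counting. The function name `count_letter` and the sample text are hypothetical, not part of the original script, and the file reading and `raw_input` prompt are left out so the snippet stands on its own:

```python
def count_letter(text, letter):
    """Count occurrences of letter in text, ignoring case."""
    # Lowercase both sides so "E" and "e" land in the same bucket.
    return text.lower().count(letter.lower())


# Hypothetical sample text standing in for the contents of mytext.txt.
sample = "Eerie Evening"

upper_query = count_letter(sample, "E")  # 5
lower_query = count_letter(sample, "e")  # 5, same result either way
```

In the original script, this would correspond to calling `count_letter(raw, prompt)` in place of `raw.count(prompt)`.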
bgporter