
First time asking a question on SO.

I am trying to find a fast way to read the screen live (60fps+). Taking a screenshot and converting it to a numpy array is fast, but not fast enough to reach that speed. There is a brilliant answer for individual pixels in this question: Most efficient/quickest way to parse pixel data with Python?

I tried replacing GetPixel with this longer BitBlt-based form that copies the whole screen into a bitmap, but that reduces it to about 5fps:

import time

import win32api
import win32con
import win32gui
import win32ui

t1 = time.time()
count = 0

# Virtual-screen geometry (covers all monitors)
width = win32api.GetSystemMetrics(win32con.SM_CXVIRTUALSCREEN)
height = win32api.GetSystemMetrics(win32con.SM_CYVIRTUALSCREEN)
left = win32api.GetSystemMetrics(win32con.SM_XVIRTUALSCREEN)
top = win32api.GetSystemMetrics(win32con.SM_YVIRTUALSCREEN)

while count < 1000:
    # Copy the screen into an in-memory bitmap
    hwin = win32gui.GetDesktopWindow()
    hwindc = win32gui.GetWindowDC(hwin)
    srcdc = win32ui.CreateDCFromHandle(hwindc)
    memdc = srcdc.CreateCompatibleDC()

    bmp = win32ui.CreateBitmap()
    bmp.CreateCompatibleBitmap(srcdc, width, height)
    memdc.SelectObject(bmp)

    memdc.BitBlt((0, 0), (width, height), srcdc, (left, top), win32con.SRCCOPY)
    bmpinfo = bmp.GetInfo()
    bmpInt = bmp.GetBitmapBits(False)  # returns a Python list of ints (slow)

    # Release GDI handles so the loop does not leak resources
    memdc.DeleteDC()
    srcdc.DeleteDC()
    win32gui.ReleaseDC(hwin, hwindc)
    win32gui.DeleteObject(bmp.GetHandle())

    count += 1

t2 = time.time()
tf = t2 - t1
it_per_sec = int(count / tf)
print(str(it_per_sec) + " iterations per second")

I watched a YouTube video of someone working in C# who said GetPixel opens and closes the bitmap memory on every call, which is why calling GetPixel on each individual pixel has so much overhead. He suggested locking the entire data region first and only then reading pixels. I don't know how to do that in Python, so any help would be appreciated. (EDIT: this link might refer to that: Unsafe Image Processing in Python like LockBits in C#)
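
If I understand the LockBits idea correctly, the Python equivalent would be to copy the whole frame into memory once and then index pixels there, instead of calling GetPixel per pixel. A minimal sketch of that idea (assuming PIL's ImageGrab rather than the BitBlt code above):

from PIL import ImageGrab
import numpy as np

# One expensive copy of the whole screen into memory...
frame = np.array(ImageGrab.grab())

# ...then per-pixel reads are plain array indexing, with no GDI call per pixel
r, g, b = frame[100, 200]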

There is also another approach that gets the memory address of the bitmap, but I don't know what to do with it. The idea is that I should be able to read the memory from that address straight into a numpy array, but I have not managed to do that.
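
For example, I believe the bitmap produced by the BitBlt code above could be handed to numpy in one go. A sketch of what I think that should look like, assuming GetBitmapBits(True) returns the raw byte buffer and that the pixel order is BGRA:

import numpy as np

# Inside the loop above, after BitBlt:
bmpinfo = bmp.GetInfo()
raw = bmp.GetBitmapBits(True)  # raw bytes instead of a Python list of ints
img = np.frombuffer(raw, dtype=np.uint8)
img = img.reshape(bmpinfo['bmHeight'], bmpinfo['bmWidth'], 4)  # BGRA channels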

Any other option for reading the screen fast would also be appreciated.

There must be a way: the GPU knows what pixels to draw at each location, so there must be a memory bank somewhere, or a data stream, that we can tap into.

P.S. Why the high-speed requirement? I am working on work-automation tools that already have a lot of overhead, and I am hoping that optimizing the screen data stream will help that part of the project.

  • You will probably need cooperation from the graphics hardware to do this in C, then write bindings to Python to allow it to access the captured stream. NVIDIA's Capture SDK might be a place to start looking. – Alex Reinking Jul 10 '18 at 01:26
  • How does streaming software work? It has some sort of access to this, right? I will try to check the NVIDIA SDK, but I wish it were more universal so I could try it on my Intel integrated laptop too. – Vitalijs Arkulinskis Jul 10 '18 at 18:14
  • The drivers can read the final framebuffer before the data gets sent to the display device. – Alex Reinking Jul 10 '18 at 18:25

1 Answer


The code below uses MSS, which, if modified to show no output, can reach about 44fps for a 1080p screen. https://python-mss.readthedocs.io/examples.html#opencv-numpy

import time

import cv2
import mss
import numpy


with mss.mss() as sct:
    # Part of the screen to capture
    monitor = {'top': 40, 'left': 0, 'width': 800, 'height': 640}

    while 'Screen capturing':
        last_time = time.time()

        # Get raw pixels from the screen, save it to a Numpy array
        img = numpy.array(sct.grab(monitor))

        # Display the picture
        #cv2.imshow('OpenCV/Numpy normal', img)

        # Display the picture in grayscale
        # cv2.imshow('OpenCV/Numpy grayscale',
        #            cv2.cvtColor(img, cv2.COLOR_BGRA2GRAY))

        print('fps: {0}'.format(1 / (time.time()-last_time)))

        # Press "q" to quit
        if cv2.waitKey(25) & 0xFF == ord('q'):
            cv2.destroyAllWindows()
            break
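
If you only want to measure the raw grab speed (what I mean by "modified to show no output"), a stripped-down timing loop might look like this (a sketch; the 200-frame count and the primary monitor are arbitrary choices):

import time

import mss
import numpy

with mss.mss() as sct:
    monitor = sct.monitors[1]  # primary monitor
    frames = 0
    t0 = time.time()
    while frames < 200:
        # Grab a frame and wrap it in a numpy array, with no display step
        img = numpy.array(sct.grab(monitor))
        frames += 1
    print('fps: {0:.1f}'.format(frames / (time.time() - t0)))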

Still not perfect, though, as it is not 60fps+; reading a raw buffer straight from the GPU, if possible, would be a better solution.