I've have several tables in a MySQL database that have rows with some duplicate column values and with a timestamp/created_at column for everyrow. As an example, I have an orders table that has a column for an 'accession_number' and a timestamp.
I is easy enough to get the most recent row (they are auto-indexed, but I'm using the timestamp to verify the time) for a specific accession_number by using a query like:
SELECT * FROM orders WHERE accession_number = ? ORDER BY timestamp DESC LIMIT 1
but there are instances where I want to get a result for all accession_numbers, but I want the most recent one in the case where there are multiple rows with the same accession_number and different timestamps.
A query similar to:
SELECT * FROM orders o1
WHERE timestamp = (SELECT MAX(timestamp) FROM orders o2 WHERE o1.accession_number = o2.accession_number)
seems the most intuitive but it seems to take a bit of time, and I'm actually using that as a subquery in a larger query with some JOINS.
Given that I'm using MySQL, just wondering if there is perhaps a better way to do that, or if this method is adequate. The queries with the JOINS do not take too long yet, but I'm concerned that things will slow down in a linear fashion as the DB gets larger.
I don't think the table is actually using the accession_number column as an index, so maybe that would help also ?
Editing to add suggested Query with results via Sequel Pro. I actually need quite a few columns (more than below) from the table and I want the values corresponding to the row with the most recent timestamp for a given accession_number. e.g.
SELECT
accession_number, patient_lname, patient_fname, referring_physician,
MAX(timestamp)
FROM
orders
GROUP BY
accession_number, patient_lname, patient_fname, referring_physician
ORDER BY
accession_number;
11.4 ms to first available row.
2.4ms to first available row for my original query.