|
27th November 2019, 11:46 | #1 |
|
SQL Combining duplicate rows
Is there a quick method to combine rows together where they are the same (based on some ordering).
I have a history table, but alot of the rows are effectively duplicated. I.e. ID |Status |Value |update_date 1 |Open | $100 |01/01/2019 1 |Open | $100 |02/01/2019 1 |Pending| $100 |21/01/2019 1 |Open | $100 |25/01/2019 So I want a table that would be have rows 1 & 2 combined 1 |Open | $100 |01/01/2019 1 |Pending| $100 |21/01/2019 1 |Open | $100 |25/01/2019 Basic group by doesnt work, because it joins the first and the last rows together. I have examples where there are 50 updates with no changes. Do i just need to do comparison. Or I could do a delete I guess? if the next row fields = current row fields delete?? |
27th November 2019, 12:57 | #2 |
HENCE WHY FOREVER ALONE
|
You could use something like this:
SELECT a.* FROM table AS a WHERE a.Status <> (SELECT b.Status FROM table AS b WHERE a.Value = b.Value AND a.update_date < b.update_date ORDER BY b.update_date DESC LIMIT 1) [stolen/modified from stackoverflow, money back guarantee. fixing the ordering is your job] It might be quicker for you up front, but slow for the database. On a table with 1M+ rows and no indexes, it'll be doing 1M+ subqueries. 10K rows might be fine. Pi vs Xeon, etc. YMMV.
__________________
Finger rolling rhythm, ride the horse one hand... |
27th November 2019, 13:52 | #3 |
|
Thanks, I have several fields I want to check to see if they have changed.
I have attached a row number to each row partitioning by all the fields I want to check ordering it by the date. Then I will select the first row |
27th November 2019, 16:36 | #4 |
|
Couldn't you just build your table with a DISTINCT clause?
__________________
Stay shook. No sook. |
25th January 2020, 13:04 | #5 |
|
So what did you come up with? This isn't Stack Overflow here!
__________________
Stay shook. No sook. |
27th January 2020, 19:00 | #6 | |
|
Quote:
Hey Blynk: Why do you say the first two rows are duplicated? Different date!
__________________
Ξ √ Ω L U T ↑ ☼ N وكل يوم كنت تعيش في العبودية |
|
28th January 2020, 12:44 | #7 |
|
Haha, how many times have I or a colleague puzzled over some code or query, wondering why we can't get something so obvious to work, only to discover our query was fine, we just hadn't paid enough attention to the data.
__________________
Stay shook. No sook. |
28th January 2020, 19:30 | #8 | |
|
Quote:
I have a history table that tracks changes. However for some reason, there are rows that are "Updated" but nothing is different. So all the values are the same except the update date. Based on that, I was not interested in the 2nd row as that causes issues with tracking of what happened. In that example, nothing actually happened on the 2nd, but the data looked like it did. |
|
30th January 2020, 23:43 | #9 |
|
OK... is there additional data that we're missing? Is there another table you can cross-reference it to, or a code other than "ID = 1"? If you do, then you have a chance of filtering it using basic SQL
If not, you'll need to start mucking with sub-selects to filter out items that have an identical transaction from the previous day - I assume it's only the previous day you're talking about and not an ordered thing. If it is purely an order thing and not a next-day-only thing, you may need to build a result-set in a stored procedure or stored function (depending on the DB type and how you want to access it) to run across the DB using a cursor and retain the previous line, compare the line excluding date before adding to result-set and if it matches don't add it.
__________________
Ξ √ Ω L U T ↑ ☼ N وكل يوم كنت تعيش في العبودية |
31st January 2020, 00:33 | #10 |
|
blynk has a "history table", or a chronologically ordered list.
Basically they want to record the first row, then exclude any subsequent row that matches all columns in this row with the exception of the "update_date" column, recording the next row that does not match all columns (i.e. a genuine update), and so on. I think it can be done, but I'm too lazy to think through the nested structures to figure it out.
__________________
Stay shook. No sook. |
31st January 2020, 12:01 | #11 | |
|
Quote:
__________________
Ξ √ Ω L U T ↑ ☼ N وكل يوم كنت تعيش في العبودية |
|
31st January 2020, 14:08 | #12 |
|
I've done it through staging tables.
Can't remember extactly but I ordered them and created a rank based on the fields I wanted to track. Which meant #1 was always the first change. Everything else was ignored. |