NZGames.com Forums
Register FAQ Calendar Mark Forums Read

Go Back   NZGames.com Forums > General > Technology & Troubleshooting
User Name
Password

Reply
 
Thread Tools
Old 27th November 2019, 11:46     #1
blynk
 
SQL Combining duplicate rows

Is there a quick method to combine rows together where they are the same (based on some ordering).
I have a history table, but alot of the rows are effectively duplicated.

I.e.
ID |Status |Value |update_date
1 |Open | $100 |01/01/2019
1 |Open | $100 |02/01/2019
1 |Pending| $100 |21/01/2019
1 |Open | $100 |25/01/2019

So I want a table that would be have rows 1 & 2 combined

1 |Open | $100 |01/01/2019
1 |Pending| $100 |21/01/2019
1 |Open | $100 |25/01/2019

Basic group by doesnt work, because it joins the first and the last rows together.

I have examples where there are 50 updates with no changes. Do i just need to do comparison.

Or I could do a delete I guess? if the next row fields = current row fields delete??
  Reply With Quote
Old 27th November 2019, 12:57     #2
DrTiTus
HENCE WHY FOREVER ALONE
 
You could use something like this:

SELECT a.* FROM table AS a
WHERE a.Status <>
(SELECT b.Status FROM table AS b
WHERE a.Value = b.Value AND a.update_date < b.update_date
ORDER BY b.update_date DESC LIMIT 1)

[stolen/modified from stackoverflow, money back guarantee. fixing the ordering is your job]

It might be quicker for you up front, but slow for the database. On a table with 1M+ rows and no indexes, it'll be doing 1M+ subqueries. 10K rows might be fine. Pi vs Xeon, etc. YMMV.
__________________
Finger rolling rhythm, ride the horse one hand...
  Reply With Quote
Old 27th November 2019, 13:52     #3
blynk
 
Thanks, I have several fields I want to check to see if they have changed.

I have attached a row number to each row partitioning by all the fields I want to check ordering it by the date.
Then I will select the first row
  Reply With Quote
Old 27th November 2019, 16:36     #4
Lightspeed
 
Couldn't you just build your table with a DISTINCT clause?
__________________
Stay shook. No sook.
  Reply With Quote
Old 25th January 2020, 13:04     #5
Lightspeed
 
So what did you come up with? This isn't Stack Overflow here!
__________________
Stay shook. No sook.
  Reply With Quote
Old 27th January 2020, 19:00     #6
crocos
 
Quote:
Originally Posted by Lightspeed
Couldn't you just build your table with a DISTINCT clause?
No, because in his example they're not actually duplicated.

Hey Blynk: Why do you say the first two rows are duplicated? Different date!
__________________
Ξ √ Ω L U T ↑ ☼ N

وكل يوم كنت تعيش في العبودية
  Reply With Quote
Old 28th January 2020, 12:44     #7
Lightspeed
 
Haha, how many times have I or a colleague puzzled over some code or query, wondering why we can't get something so obvious to work, only to discover our query was fine, we just hadn't paid enough attention to the data.
__________________
Stay shook. No sook.
  Reply With Quote
Old 28th January 2020, 19:30     #8
blynk
 
Quote:
Originally Posted by crocos
No, because in his example they're not actually duplicated.

Hey Blynk: Why do you say the first two rows are duplicated? Different date!
Maybe a better explanation was needed at the start.
I have a history table that tracks changes. However for some reason, there are rows that are "Updated" but nothing is different.
So all the values are the same except the update date.

Based on that, I was not interested in the 2nd row as that causes issues with tracking of what happened.

In that example, nothing actually happened on the 2nd, but the data looked like it did.
  Reply With Quote
Old 30th January 2020, 23:43     #9
crocos
 
OK... is there additional data that we're missing? Is there another table you can cross-reference it to, or a code other than "ID = 1"? If you do, then you have a chance of filtering it using basic SQL

If not, you'll need to start mucking with sub-selects to filter out items that have an identical transaction from the previous day - I assume it's only the previous day you're talking about and not an ordered thing.

If it is purely an order thing and not a next-day-only thing, you may need to build a result-set in a stored procedure or stored function (depending on the DB type and how you want to access it) to run across the DB using a cursor and retain the previous line, compare the line excluding date before adding to result-set and if it matches don't add it.
__________________
Ξ √ Ω L U T ↑ ☼ N

وكل يوم كنت تعيش في العبودية
  Reply With Quote
Old 31st January 2020, 00:33     #10
Lightspeed
 
blynk has a "history table", or a chronologically ordered list.

Basically they want to record the first row, then exclude any subsequent row that matches all columns in this row with the exception of the "update_date" column, recording the next row that does not match all columns (i.e. a genuine update), and so on.

I think it can be done, but I'm too lazy to think through the nested structures to figure it out.
__________________
Stay shook. No sook.
  Reply With Quote
Old 31st January 2020, 12:01     #11
crocos
 
Quote:
Originally Posted by Lightspeed
I think it can be done, but I'm too lazy to think through the nested structures to figure it out.
I'd already thought of three different ways to do it above, in increasing complexity depending on how his data is structured
__________________
Ξ √ Ω L U T ↑ ☼ N

وكل يوم كنت تعيش في العبودية
  Reply With Quote
Old 31st January 2020, 14:08     #12
blynk
 
I've done it through staging tables.
Can't remember extactly but I ordered them and created a rank based on the fields I wanted to track. Which meant #1 was always the first change.
Everything else was ignored.
  Reply With Quote
Reply


Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Forum Jump



© Copyright NZGames.com 1996-2023
Site paid for by members (love you guys)