mysql - SQL join 3 tables grouped with latest group value -
i have 3 tables. user, messages , user_analytics following structure:
- user (userid) - contains users
- message (messageid(pk),userid(fk),time) - contains messages
- user_analytics (user_analyticsid(pk),userid(fk),device,time) -
contains data collected on connection
user : messages (1:n) user : device (1:n)
now know how many messages sent on each day device. therefor first need collect each message device (desktop,ios,android) used send message depending on message time itself. means need user_analytics.time <= message.time , display latest result.
i saw lot solutions greatest-n-per-group didn't work.
i work subquery takes 20 seconds (user_analytics holds 100k records , message 3k... not much):
select date_format(m.time,'%y-%m-%d') date, count(*) message_count, ua.device message m, user u left join user_analytics ua on ( u.userid = ua.userid , ua.user_analyticsid = ( select max(user_analyticsid) user_analytics userid = m.userid , time < m.time)) m.userid = u.userid group 1,3;
but looks highly inefficient. other ways achieve same result?
update: forgot mention have important condition on user table. why need join table.
i created sql fiddle give example. implemented jaguar chang's solution 100 times faster mine:
there's no necessity of joining user table, simplify code this:
select date_format(m.time,'%y-%m-%d') date, count(*) message_count, ua.device message m, left join user_analytics ua on ( m.userid = ua.userid , ua.user_analyticsid = ( select max(user_analyticsid) user_analytics userid = m.userid , time < m.time)) group 1,3;
this may not efficient enough, try this:
select date_format(t2.time,'%y-%m-%d') date, count(*) message_count, t2.last_device (select @device := if(@uid = userid, if(tbl = 'm' ,@device, device), if(@uid := userid,device,device)) last_device ,t1.* (select @device := '' , @uid :=0) t0 join (select 'ua' tbl,userid,time,device user_analytics union select 'm' tbl,userid,time,null device messages ) t1 order userid,time ) t2 tbl='m' group 1,3;
i guess original purpose divide messages times of connection on device, order messages , connection record time series together, device used last connection every message.
i think approach efficient, since 100k+3k sorting faster 3k*100k*100k join operations
a test sql fiddle demo.
Comments
Post a Comment