Une des demandes les plus récurrentes, lorsque l’on fait des tableaux de bord c’est de retrouver les « 10 meilleurs clients par zone commerciale », ou encore les « 10 plus gros contributeurs facebook par page » .
Create TABLE Contribs (id_page int, page varchar(50), fan varchar(50) , contribs INT);
INSERT INTO Contribs VALUES (2, ‘françois Hollande’, ‘toto’ , 65);
INSERT INTO Contribs VALUES (2,’françois Hollande’, ‘titi’ , 50);
INSERT INTO Contribs VALUES (2,’françois Hollande’, ‘tutu’ , 30);
INSERT INTO Contribs VALUES (2,’françois Hollande’, ‘tata’ , 20);
INSERT INTO Contribs VALUES (8,’Nicolas Sarkozy’, ‘toto’ , 5);
INSERT INTO Contribs VALUES (8,’Nicolas Sarkozy’, ‘titi’ , 41);
INSERT INTO Contribs VALUES (8,’Nicolas Sarkozy’, ‘tutu’ , 35);
INSERT INTO Contribs VALUES (8,’Nicolas Sarkozy’, ‘tata’ , 66);
INSERT INTO Contribs VALUES (8,’Nicolas Sarkozy’, ‘tete’ , 67);
INSERT INTO Contribs VALUES (24,’Marine Le Pen’, ‘toto’ , 35);
INSERT INTO Contribs VALUES (24,’Marine Le Pen’, ‘tata’ , 15);
INSERT INTO Contribs VALUES (24,’Marine Le Pen’, ‘titi’ , 5);
INSERT INTO Contribs VALUES (24,’Marine Le Pen’, ‘tutu’ , 3);
INSERT INTO Contribs VALUES (24,’Marine Le Pen’, ‘zozo’ , 2);
INSERT INTO Contribs VALUES (8,’Nicolas Sarkozy’, ‘zizi’ , 69);
INSERT INTO Contribs VALUES (8,’Nicolas Sarkozy’, ‘zaza’ , 100);
INSERT INTO Contribs VALUES (8,’Nicolas Sarkozy’, ‘zuzu’ , 1230);
Sql Server:
select * from ( select page, fan, contribs, RANK() over ( Partition by page order by contribs DESC) as Rank
from Contribs ) as TEMP
where Rank <=10;
MySql:
Je m’explique:
au sein même de la requête, on utilise l’incrémentation au fur et à mesure que le moteur lit et ordonne les lignes qu’il trouve.
On doit d’abord ordonner les contributeurs par page:
select page, fans, contribs from Contribs order by page, contribs DESC;
On introduit l’incrémentation au travers d’une variable : @count.
Dès que le moteur de requête change de page, il est nécessaire de remettre à zéro la variable @count.
L’indication du changement de page se fera avec la variable : @oldPage
select @oldPage := 0 , @count := 0;
SELECT
@count := IF(id_page <=> @oldPage, @count+1, 0) as mycount,
@oldPage := id_page,
page,
fan,
contribs
FROM Contribs
ORDER BY id_page, contribs DESC;
Il suffit ensuite de filtrer :
select fan,page, contribs from (SELECT
@count := IF(id_page <=> @oldPage, @count+1, 0) as mycount,
@oldPage := id_page,
page,
fan,
contribs
FROM Contribs
ORDER BY id_page, contribs DESC) t
where mycount<=10;
ANSI SQL:
On peut essayer d’imaginer d’extraire pour chaque contribution ( contribs) le sous-ensemble des contributions supérieures et inférieures.
Pour chaque contribs, on peut extraire le sous ensemble des contribs supérieures:
select C2.contribs, C1.contribs
from Contribs as C1
join Contribs as C2 on C1.contribs >= C2.contribs
On veut maintenant n’avoir que les contribs qui n’ont que trois contribs qui lui sont supérieurs. On va compter pour chaque sous ensemble le nombre distinct de contribs :
select min(C1.contribs)
from Contribs as C1
join Contribs as C2 on C1.contribs >= C2.contribs
group by C2.contribs
having count(DISTINCT C1.contribs) <=3;
Avec cette requête, on récupère ainsi les trois plus hautes contribs c’est à dire 69,100 et 1230
Enfin, on veut faire le regroupement par page :
select min(C1.contribs) , C1.page
from Contribs as C1
join Contribs as C2 on C1.contribs >= C2.contribs and C1.id_page = C2.id_page
group by C2.contribs, C1.page
having count(DISTINCT C1.contribs) <=3;
Il ne reste plus qu’à injecter les noms des contributeurs :
select c.* from Contribs c
JOIN
(select min(C1.contribs) as contribs, C1.id_page
from Contribs as C1
join Contribs as C2 on C1.contribs >= C2.contribs and C1.id_page = C2.id_page
group by C2.contribs, C1.id_page
having count(DISTINCT C1.contribs) <=3) t
on t.id_page = c.id_page and c.contribs = t.contribs;
Et voilà: on récupère les trois meilleurs contributeurs par page!!
C’est d’ailleurs la requête la plus élégante et celle qui fonctionnera partout.
MDX:
Alors là, le système est très simple avec l’instruction TopCount puisqu’il permet de faire le filtre pour chaque sous-ensemble.
Par exemple, on crée un « set » des clients à qui on a le plus vendu sur 2012, puis on peut l’utiliser en dans une requête MDX:
with set [Top10cust] as ‘TopCount({[Client].[ClientName].Members}, 10.0, ([Time].[2012], [Measures].[sales]))’
member [Measures].[Ventes] as ‘[Measures].[sales]’
select {[Measures].[Ventes]} on COLUMNS,
{[Top10cust]} ON ROWS
from [Commercial]