The pfx_dw
schema is a fairly textbook "star schema", with things like jobs and worker usage in fact tables, and the various things that you might use in a SQL WHERE
clause in dimension tables.
The job
...
dimension table, a typical "fact" table.
For example, to get reports about various jobs over time, you'll be querying the pfx_dw.job_fact table:
...
Code Block | ||
---|---|---|
| ||
mysql> SELECT * FROM jobstatus_dim; +--------------+------------+-------------+----------------+-------------+ | jobstatus_sk | status_int | status_char | effective_date | expiry_date | +--------------+------------+-------------+----------------+-------------+ | 1 | 16 | complete | 1999-12-31 | 9999-12-31 | | 2 | 32 | failed | 1999-12-31 | 9999-12-31 | | 3 | 48 | killed | 1999-12-31 | 9999-12-31 | | 4 | 272 | blocked | 1999-12-31 | 9999-12-31 | | 5 | 288 | waiting | 1999-12-31 | 9999-12-31 | | 6 | 304 | suspended | 1999-12-31 | 9999-12-31 | | 7 | 320 | pending | 1999-12-31 | 9999-12-31 | | 8 | 323 | waiting | 1999-12-31 | 9999-12-31 | | 9 | 325 | badlogin | 1999-12-31 | 9999-12-31 | | 10 | 336 | running | 1999-12-31 | 9999-12-31 | | 11 | 261 | dying | 1999-12-31 | 9999-12-31 | +--------------+------------+-------------+----------------+-------------+ |
Get a count of all jobs for a particular user for January, 2014:
...
Get the sum total of cpu_seconds used for each user for the last 7 days, broken down by user, date, and the job's final status:
...
...
language | sql |
---|
...
SELECT user . name |
...
, time . date |
...
, status.status_char |
...
, SUM (fact.cpu_seconds) |
...
as |
...
"cpu_time" |
...
FROM job_ |
...
fact AS |
...
fact |
...
INNER |
...
JOIN user_ |
...
dim AS |
...
user |
...
ON fact.user_sk= user .user_sk |
...
INNER |
...
JOIN time_ |
...
dim AS |
...
time |
...
ON fact.time_sk= time .time_sk |
...
INNER |
...
JOIN jobstatus_ |
...
dim AS |
...
status |
...
ON fact.jobstatus_sk=status.jobstatus_sk |
...
WHERE DATEDIFF(CURDATE(), |
...
time .date_time) |
...
< |
...
7 |
...
GROUP |
...
BY user . name |
...
, time . date |
...
, status.status_int |
...
ORDER |
...
BY time . date |
...
, cpu_ |
...
time DESC , status.status_char |
...
; + --------+------------+-------------+----------+ |
...
| |
...
name |
...
| |
...
date | status_char |
...
| |
...
cpu_time |
...
| |
...
+ --------+------------+-------------+----------+ |
...
<< snipped >> | jburk | 2014-07-14 |
...
| complete | 351036 | | jburk | 2014-07-14 |
...
| killed | 60029 | | jburk | 2014-07-14 |
...
| failed | 139 | | coxj | 2014-07-14 |
...
| killed | 98 | | garza | 2014-07-14 |
...
| killed | 0 | | jburk | 2014-07-15 |
...
| complete | 28910 | | fubar | 2014-07-15 |
...
| complete | 18610 | | foobar | 2014-07-15 |
...
| complete | 18561 | | jburk | 2014-07-15 |
...
| killed | 16967 | | jburk | 2014-07-15 |
...
| failed | 27 | | jburk | 2014-07-16 |
...
| complete | 46797 | | jburk | 2014-07-16 |
...
| killed | 17136 | | jburk | 2014-07-16 |
...
| failed | 2 | << snipped >> + --------+------------+-------------+----------+ |