Because everything’s better with bacon


Friday, November 6, 2009

Refactoring!

Last night at the hackathon,  we refactored one of our queries from my review of Refactoring SQL Applications.*

First, we had a duplicate field name in the original select.  Not a problem if you’re just doing a select, but if you want to create a table (temp or otherwise) from the data, it won’t work.  So we replaced the first num_rows with rows_in_bytes.
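
A minimal sketch of the duplicate-name problem, for anyone who wants to see it happen. The table name dup_demo and the literal values are made up for illustration; only the column names come from our query:

```sql
-- A bare SELECT tolerates two output columns with the same name:
SELECT 1 AS num_rows, 2 AS num_rows;

-- Creating a table from the same SELECT is rejected with an error,
-- because a table cannot hold two columns that are both called num_rows:
CREATE TEMP TABLE dup_demo AS
SELECT 1 AS num_rows, 2 AS num_rows;

-- Renaming one of the columns lets the table be created:
CREATE TEMP TABLE dup_demo AS
SELECT 1 AS rows_in_bytes, 2 AS num_rows;
```

In psql the failing statement just reports its error and the session carries on, so all three statements can be pasted in one go.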

Also, reading over this 5 months after the original attempt, I realize it’s a lot clearer if we don’t use table aliases in the outer SELECTs.

Then, we got some advice from Greg Smith that we shouldn’t do joins on pg_class.relname – this can screw you up if you have different schemas with identical table names.  You want to use oids (which I’d always thought was not desirable, but I’m assured it’s ok if you’re doing it with the system tables – you don’t want your application to depend on them, though. :) )  So, instead, we match pg_namespace.oid with pg_class.relnamespace.

Selena’s illustration of how this works:
SELECT relname, relkind
FROM pg_class
JOIN pg_namespace ON pg_namespace.oid = pg_class.relnamespace
WHERE relkind = 'r'
  AND pg_namespace.nspname = 'public';

The new & improved version of the query can be found on the Pg wiki.

I wanted to compare the new query against the old, so I created a couple of temp tables containing the results… and discovered we had a couple of data discrepancies:  a few of our tables were listed twice in the original query results, with different values for num_rows, only one of which was correct for the current schema:

portal=# SELECT tablename, rows_in_bytes, num_rows FROM index_experiment_1
WHERE tablename IN  ('detectorid_count','stations','test_agg')
ORDER BY 1;
    tablename     | rows_in_bytes | num_rows
------------------+---------------+----------
 detectorid_count | 0 bytes       |        0
 detectorid_count | 631 bytes     |      631
 stations         | 22 bytes      |       22
 stations         | 350 bytes     |      350
 test_agg         | 0 bytes       |        0
 test_agg         | 1386 bytes    |     1386
(6 rows)

portal=# SELECT count(*) from detectorid_count;
 count
-------
     0
(1 row)

portal=# SELECT count(*) from stations;
 count
-------
   350
(1 row)

portal=# SELECT count(*) from test_agg ;
 count
-------
     0
(1 row)

It turns out we’d run into the exact problem that Greg had warned us about.  The additional rows were from identically-named tables in other namespaces.

Find your namespaces:
portal=# SELECT nspname from pg_namespace order by 1;
      nspname
--------------------
 information_schema
 pg_catalog
 pg_temp_1
 pg_temp_2
 pg_toast
 pg_toast_temp_1
 pg_toast_temp_2
 public
 selena
 wendell
(10 rows)

Find your data:
portal=# SELECT count(*) from selena.detectorid_count ;
 count
-------
   631
(1 row)

portal=# SELECT count(*) from wendell.stations ;
 count
-------
    22
(1 row)

portal=# SELECT count(*) from selena.test_agg ;
 count
-------
  1386
(1 row)

Note that these match the additional data from our original query.
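
If you’d like to reproduce the effect on a scratch database, here is a rough sketch. The schemas demo_a and demo_b and their tables are hypothetical, created only for this demonstration, and the sketch assumes no other relation named stations exists in the database:

```sql
-- Hypothetical schemas and tables, created only for this demonstration.
CREATE SCHEMA demo_a;
CREATE SCHEMA demo_b;
CREATE TABLE demo_a.stations (id integer);
CREATE TABLE demo_b.stations (id integer);

-- Joining on the table name alone matches every pg_class entry called
-- "stations", so the one pg_tables row for demo_a.stations comes back
-- once per match (twice here, with two different oids):
SELECT t.schemaname, t.tablename, c.oid
FROM pg_tables t
JOIN pg_class c ON t.tablename = c.relname
WHERE t.schemaname = 'demo_a'
  AND t.tablename = 'stations';

-- Matching on the namespace oid as well returns exactly one row:
SELECT n.nspname, c.relname, c.oid
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'demo_a'
  AND c.relname = 'stations'
  AND c.relkind = 'r';

-- Clean up.
DROP SCHEMA demo_a CASCADE;
DROP SCHEMA demo_b CASCADE;
```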

Thanks, Greg!


* No, I haven’t finished reading it yet…I don’t read during the summer, I ride my bike.

posted by gabrielle at 6:35 pm  

Monday, October 19, 2009

PgWest: Sunday

We arrived at the conference site to find that the XML Data Warehousing talk had been canceled, so I spent that session in the Hackers’ Lounge attempting to continue work on pg_proctab, while getting kicked off the commie college wireless.

In Lists and Recursions and Trees, Oh My!, David Fetter gave us some examples of old kludges to get row numbers out of Pg – “Not only is it slow, but it’s wrong” – but you may not notice that subtle wrongness in huge data sets.  This really illustrated the value of testing your data.
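
For comparison, here is a small sketch of getting row numbers with a window function, which needs 8.4 or later. This is not an example from David’s talk; the talks table and its contents are invented for illustration:

```sql
-- Hypothetical sample data, for illustration only.
CREATE TEMP TABLE talks (speaker text, title text);
INSERT INTO talks VALUES
    ('Fetter', 'Lists and Recursion and Trees'),
    ('Berkus', '5 Steps to PostgreSQL Performance'),
    ('Davis',  'Not Just UNIQUE');

-- row_number() numbers the rows in the order given by its OVER clause,
-- so each speaker gets a stable, well-defined position.
SELECT row_number() OVER (ORDER BY speaker) AS n, speaker, title
FROM talks
ORDER BY n;
```

The numbering comes from the ORDER BY inside OVER, not from whatever order the rows happen to be stored in.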

After lunch, I went to Josh Berkus’s 5 steps to PostgreSQL Performance Tuning.

He gave us some rules of thumb for figuring out how much RAM & CPU you need, but also recommended hiring a hardware geek to design your system for you – because vendors lie. :)   Try hardware out before you purchase it, or definitely test it within the warranty period.  And, here’s another use case for pg_proctab (other than my own amusement):  capacity planning.

Tip:  Don’t use autovacuum for data warehousing applications, or where you have a large number of writes happening at once.  Manually vacuum those.
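
One way this could look in practice, as a sketch rather than anything shown in the talk: switch autovacuum off for a single table and vacuum it by hand after a load. The table fact_loads and its data are hypothetical, and the per-table autovacuum_enabled storage parameter assumes 8.4 or later:

```sql
-- Hypothetical warehouse table, for illustration only.
CREATE TABLE fact_loads (id integer, payload text);

-- Turn autovacuum off for just this table (storage parameter, 8.4+).
ALTER TABLE fact_loads SET (autovacuum_enabled = false);

-- Stand-in for a bulk load.
INSERT INTO fact_loads
SELECT g, 'row ' || g::text
FROM generate_series(1, 1000) AS g;

-- Vacuum and refresh planner statistics by hand once the load is done.
-- VACUUM cannot run inside a transaction block.
VACUUM ANALYZE fact_loads;

-- Clean up.
DROP TABLE fact_loads;
```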

(An additional tip from me:  if you’re using Linux, try increasing the default readahead buffer from 1024K to at least 1M for an ~80% performance improvement.  See our [in]famous file systems talk for the graphs to back this up.)

Thanks for another wonderful conference experience, PgPeeps!  See you again soon!

posted by gabrielle at 3:46 pm  

Monday, October 19, 2009

PGWest: Saturday

This past weekend was the 3rd annual PgWest.  The conference moved up to Seattle this year, and I think it was the biggest it’s ever been.  As usual, there were more interesting talks scheduled than I had time to attend.  (This is the 21st century;  where’s my time machine?)

For my first tech conferences a few years ago, I only went to sessions that were meaningful for my job.  I’ve since had a much better time (and learned more) by choosing which sessions I’ll attend based on the following criteria, in this order:
1) topic interestingness
2) speaker interestingness
3) relevance to my job duties

(See Tips #1 and #2 in Skud’s recent Ten tips for tech conference attendees post.)

So, right out of the gate at PgWest, I’m in a Python talk* – Adrian K’s (of LinuxFestNW fame) discussion on Dabo.  Dabo’s a Python desktop framework;  I program primarily in Perl, and I’ve never touched a desktop app.  Adrian’s example project was a management system for a plant nursery, which I *do* understand, so I had a point of reference into the material (the methods & options used to track plants made sense to me).  I really wanted to talk to him more about this app, but never caught up with him.  (The hallway track felt kind of rushed for me this time.)  I got a good idea for form validation – if a user tries to enter a blank value where one is not allowed, they get a pop-up immediately and the original text (if there was any) is put back in the field, forcing the user to accept the original input or enter something new before they can proceed to the next field.  This is a step up from giving the user the error message after they’ve submitted the form.

Next we were on to JD’s keynote, featuring the usual heckling of and by the podium.

Then Mark’s & my talk about pg_proctab, which ended with some live demos & some audience participation, the way I like it.

A bunch of us went to lunch at Honeyhole Sandwiches, where I tried the “Texas Tease” – BBQ chicken.  The sandwich was excellent.  I *highly* recommend the fries.

Scott Bailey’s Temporal Data talk was *packed*.  He talked about the “period” datatype, featured in both his own (Chronos) and Jeff Davis’s PgTemporal project.  You can do unions & intersects on time periods.  I am thinking this would be a useful datatype for searching large tables of log entries.
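
To give a feel for unions & intersects on time periods, here is a sketch using the range types that are built into later PostgreSQL releases (9.2 and up). These are not the “period” type from Chronos or PgTemporal, and the timestamps are made up:

```sql
-- Do two time periods overlap?
SELECT tsrange('2009-10-17 09:00', '2009-10-17 10:00')
    && tsrange('2009-10-17 09:30', '2009-10-17 11:00') AS overlaps;

-- Intersection: the stretch of time covered by both periods.
SELECT tsrange('2009-10-17 09:00', '2009-10-17 10:00')
     * tsrange('2009-10-17 09:30', '2009-10-17 11:00') AS intersection;

-- Union: one period covering both (they must overlap or be adjacent).
SELECT tsrange('2009-10-17 09:00', '2009-10-17 10:00')
     + tsrange('2009-10-17 09:30', '2009-10-17 11:00') AS combined;
```

For the log-search idea, the same operators could go in a WHERE clause to pick out entries that fall inside a given period.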

Based on Scott’s talk, I decided to go to Jeff’s “Not Just UNIQUE” talk, because he would be discussing this in a little more detail.  This meant I missed the session on backup & recovery.  (See comment above about more material than I can fit in my schedule.)

I spent the last session partly in the hackers’ lounge, working on some pg_proctab wrapper scripts with Mark.

Then it was off to the EDB-sponsored after-party, where I caught up with Lloyd Albin, who spoke at PDXPUG about a year ago.  He brought me up-to-date on the work he’s done on the project, including a Twitter feed to let clients know of updates, which I think is really cool.

*Which I was late to, because we were installing the snacks in the Hackers’ Lounge (thanks, Mark!)

posted by gabrielle at 3:35 pm  

Tuesday, October 13, 2009

My picks for PgWest

(I’ll be missing Friday’s tutorials.)

Saturday:
9am:  Jeff Davis:  PostgreSQL, Extensible to the Nth Degree.  Jeff’s talks usually melt my brain, and I like that.
10:15:  Conference Keynote.
11:30am:  Mark Wong: pg_proctab.  Turns out I’m giving this talk with Mark, even though my name’s not on the schedule.  I should probably show up.
1:45pm:  Scott Bailey:  Temporal Data or Magnus Hagander:  Secure PostgreSQL Deployment.  There will be a coin toss.
3:00pm:  Kevin Kempter:  Backup and Recovery.  There’s always something else to learn about this topic.
4:00pm:  Bill Karwin:  Practical Full-text Search.

Sunday:
9:00am:  Aaron Sheldon:  XML Data Warehousing.
10:15am:  David Fetter:  Lists and Recursion and Trees (Oh, My!)  I want to learn about windowing functions, new in 8.4.
11:15am:  Matt Smiley:  Basic Query Tuning Primer.  Another topic I could stand to learn more about.
1:30pm:  Tossup between David Wheeler:  pgTAP Unit Testing Best Practices and Josh Berkus:  5 Steps to PostgreSQL Performance.  I’ll probably go to Berkus’s talk because Wheeler is a sport about repeating his talks for PDXPUG.

Other fun stuff:

The Hacker lounge will be open for two days of geekery:  7:30 am – 4:30pm Saturday, and 9-4 on Sunday.
EnterpriseDB has stepped up to provide entertainment after the Saturday sessions.
I haven’t heard if there are Lightning Talks, but I have a couple of ideas for one.  You should too.

See you there!

posted by gabrielle at 5:24 pm  

Thursday, October 8, 2009

Are you going to PgWest?

At a loss for what to do next weekend?  Grab your rain gear & head on up to Seattle for PgWest 2009: http://www.postgresqlconference.org/2009/west/.

There’ll be three days of talks & tutorials (http://www.postgresqlconference.org/2009/west/schedule) plus a hackers’ lounge (http://wiki.postgresql.org/wiki/Hackers%27_Lounge).   After-party plans are nebulous at this time, but we are researching options.  (Psst–pub crawl!)

Come join the fun!

posted by gabrielle at 5:07 pm  

Wednesday, September 23, 2009

PDXPUG Patch Review Party

(just in case you haven’t read about it yet):

http://pugs.postgresql.org/node/584

posted by gabrielle at 11:30 am  

Saturday, June 6, 2009

Book Review (part I): Refactoring SQL Applications, with bonus queries

It’s taking me quite a while to wade through Stéphane Faroult’s Refactoring SQL Applications. I just finished Chapter 2 & figured I’d better just go ahead with the review.

It’s quite humorous – I mean, there’s a section called “Queries of Death” – but this is some dense material, make no mistake. I tried to keep my copy nice so I could loan it to others, but I had to give up and get out The Pen, and it’s been highlighted and scribbled on.

Small gripe: the layout of the example queries makes them hard to read (capitalizing the conditionals would help). I’d also like to see more examples of result sets.

The section about statistics sparked a lively discussion on #pdxpug about cardinality vs selectivity*. What I thought I knew about indexes has been turned on its head – don’t base your decisions just on whether or not the column in question is searched on.

One of the recommendations for “Sanity Checks” is to take a good look at your indexes. For starters, check for tables with no indexes, or a lot of indexes. There’s a sample query to pull the number of rows, indexes, and some info about those indexes for each table. Faroult only shows sample queries for Oracle, SQL Server, and MySQL, so Selena & I put our heads together & came up with an equivalent for PostgreSQL:

(Only works on 8.3; ditch the pg_size_pretty if you’re on an earlier version)

SELECT
    t.tablename,
    pg_size_pretty(c.reltuples::bigint) AS num_rows,
    c.reltuples AS num_rows,
    count(indexname) AS number_of_indexes,
    CASE WHEN x.is_unique = 1 THEN 'Y'
       ELSE 'N'
    END AS unique,
    SUM(case WHEN number_of_columns = 1 THEN 1
              ELSE 0
            END) AS single_column,
    SUM(case WHEN number_of_columns IS NULL THEN 0
             WHEN number_of_columns = 1 THEN 0
             ELSE 1
           END) AS multi_column
FROM pg_tables t
LEFT OUTER JOIN pg_class c ON t.tablename=c.relname
LEFT OUTER JOIN
       (SELECT indrelid,
           max(CAST(indisunique AS integer)) AS is_unique
       FROM pg_index
       GROUP BY indrelid) x
       ON c.oid = x.indrelid
LEFT OUTER JOIN
    ( SELECT c.relname as ctablename, ipg.relname as indexname, x.indnatts as number_of_columns FROM pg_index x
           JOIN pg_class c ON c.oid = x.indrelid
           JOIN pg_class ipg on ipg.oid = x.indexrelid  )
    as foo
    ON t.tablename = foo.ctablename
WHERE t.schemaname='public'
GROUP BY t.tablename, c.reltuples, x.is_unique
order by 2;

It took quite a bit of chocolate to wrap that up…afterwards, Selena decided that it would be neat to look at table & index sizes and see which indexes were being scanned and how many tuples fetched:

SELECT
    t.tablename,
    indexname,
    c.reltuples AS num_rows,
    pg_size_pretty(pg_relation_size(t.tablename)) AS table_size,
    pg_size_pretty(pg_relation_size(indexrelname)) AS index_size,
    CASE WHEN x.is_unique = 1  THEN 'Y'
       ELSE 'N'
    END AS unique,
    idx_scan AS number_of_scans,
    idx_tup_read AS tuples_read,
    idx_tup_fetch AS tuples_fetched
FROM pg_tables t
LEFT OUTER JOIN pg_class c ON t.tablename=c.relname
LEFT OUTER JOIN
       (SELECT indrelid,
           max(CAST(indisunique AS integer)) AS is_unique
       FROM pg_index
       GROUP BY indrelid) x
       ON c.oid = x.indrelid
LEFT OUTER JOIN
    ( SELECT c.relname as ctablename, ipg.relname as indexname, x.indnatts as number_of_columns, idx_scan, idx_tup_read, idx_tup_fetch,indexrelname FROM pg_index x
           JOIN pg_class c ON c.oid = x.indrelid
           JOIN pg_class ipg ON ipg.oid = x.indexrelid
           JOIN pg_stat_all_indexes psai ON x.indexrelid = psai.indexrelid )
    as foo
    ON t.tablename = foo.ctablename
WHERE t.schemaname='public'
order by 1,2;



* cardinality: size of the relation (“number of rows in [something]”)
selectivity: percent of the relation that’s selected
cardinality * selectivity = number of tuples in your results set.
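
A quick worked example of that arithmetic, using a made-up relation of 1000 generated rows and a made-up condition that keeps every tenth row:

```sql
-- 1000 rows in the relation; the condition id % 10 = 0 keeps 100 of them,
-- so selectivity is 100 / 1000 = 0.1 and
-- cardinality * selectivity = 1000 * 0.1 = 100 tuples in the result set.
SELECT count(*) AS cardinality,
       sum(CASE WHEN id % 10 = 0 THEN 1 ELSE 0 END) AS rows_selected,
       sum(CASE WHEN id % 10 = 0 THEN 1 ELSE 0 END)::numeric
           / count(*) AS selectivity
FROM generate_series(1, 1000) AS t(id);
```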

posted by gabrielle at 5:35 pm  

Friday, April 3, 2009

Friday Happy Hour: Gimme some sugar, baby.

Time for some more fun with managing user data, of the “who was connected where and when” type. I’m going to use PostgreSQL row constructors & subqueries to filter my data.

I have a table that contains switch names & ports which are connected to other switches:
testytest=# SELECT switch_name, switch_port, connected_to
FROM switch_connections;
 switch_name | switch_port | connected_to
-------------+-------------+--------------
 switch-1    |           1 | switch-2
 switch-1    |           2 | switch-3
 switch-2    |           1 | switch-1
 switch-3    |           1 | switch-1
(4 rows)

Another table contains hostnames found on each switch port at a given point in time:
(more…)

posted by gabrielle at 5:28 pm  

Friday, March 6, 2009

Friday Happy Hour: PostgreSQL & mac addresses

Postgres has a datatype just for storing mac addresses. Let’s check it out!
(more…)

posted by gabrielle at 6:41 pm  

Wednesday, December 3, 2008

Data from the PostgreSQL JTA

Data from the PostgreSQL Job Task Analysis survey.

I shortened some questions and answers to better fit on the page; see original data if you are concerned about potential shifts in meaning. In many cases, I re-ordered the answers in a way that makes sense to me. Again, please see the original data if this concerns you.  Not all data has been graphed.  Graphs created in OpenOffice Calc.  I would have rather used gnuplot, but I preferred the horizontal bar chart style & gnuplot can’t do that yet without a lot of gymnastics.

PostgreSQL JTA – Graphs

posted by gabrielle at 7:37 pm  