Pain will teach you that …

  • SELECT * is bad if you don’t know how much data you may get back.
  • … doing a while loop over an unknown amount of data from said SELECT * is even worse.
  • … doing this while some end user is waiting for the application to respond, while the application happily loops over an unknown amount of data and calls a remote system (and waits for its answer) for each data set, is kind of suicidal. For your system and probably for the remote system as well.
  • … tricking the application into NOT deleting data that was already transmitted to the remote system, so it tries to send everything again on the next run, is what you might call a self-regulating system. Your application gets slower from call to call and eventually it gets so slow that people stop using the feature in question. Mission accomplished, system saved.

There’s also the usual variation: the end user waits for a page to load while the application tries to load an unknown amount of data from a remote system BEFORE delivering the page. The end user thinks the page does not load, clicks again and again and triggers a new data-fetching call with each click.

Lessons to be learned:

  • Unless you know exactly what you get back, don’t use SELECT *; fetch only the columns you need from each row. If there is a chance you get too much, add a LIMIT before you choke.
  • Don’t write loops without an emergency brake: bail out when a maximum execution time or a maximum number of cycles is reached (see the sketch right after this list).
  • Had I only had a unit test in the right place ;)
  • Add a fail counter to the data waiting to be transmitted. If the remote application still rejects a record after n retries, stop trying to send it; retrying forever only lessens the chance of getting the valid records through.
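
To make that less abstract, here is a minimal sketch in PHP of what those brakes could look like. Everything in it is made up for illustration: the outbox table, its payload and fail_count columns and the sendToRemote() helper are assumptions of mine, not code from any real project.

<?php
// Hypothetical example: push queued rows from an 'outbox' table to a remote
// system, with a LIMIT on the query, an emergency brake for the loop and a
// fail counter per row.
$pdo = new PDO('mysql:host=localhost;dbname=example', 'user', 'pass');

$maxRuntime = 30;    // seconds; bail out after this
$maxRows    = 500;   // never fetch more rows than we can handle per run
$maxRetries = 5;     // give up on a row after this many failed attempts
$start      = time();

// Fetch only the columns we need and never more rows than $maxRows.
$stmt = $pdo->query(
    "SELECT id, payload FROM outbox
     WHERE fail_count < {$maxRetries}
     ORDER BY id
     LIMIT {$maxRows}"
);

foreach ($stmt as $row) {
    // Emergency brake: stop before the end user or the cron slot gives up.
    if (time() - $start > $maxRuntime) {
        break;
    }

    if (sendToRemote($row['payload'])) {   // hypothetical helper
        // Delete what was really transmitted so it is not sent again.
        $del = $pdo->prepare('DELETE FROM outbox WHERE id = ?');
        $del->execute(array($row['id']));
    } else {
        // Count the failure; after $maxRetries the row is skipped by the query above.
        $upd = $pdo->prepare('UPDATE outbox SET fail_count = fail_count + 1 WHERE id = ?');
        $upd->execute(array($row['id']));
    }
}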

I learned some equivalent of that lesson long ago. But somehow that’s a kind of booby trap every developer somehow has to step into at some point. Read, remember and step around that trap ;)

Posted in coffee, heavy metal (e.g. lead), PHP | Leave a comment

Kniffeliges OOP – Part 1

The Plan

My friend is quite fond of a game named ‘Kniffel’ (a.k.a. ‘Yahtzee’) and as we are seldom in the same place, I started thinking about a way to play it over the Internet. Of course I want to write the application myself (if not, where’s the fun for me?) and take the chance to report about the progress and stuff I think might also be of interest to others.

In the beginning there’s always a user story:

  • We want to play dice with n players, n >= 2 and n <= 8 (at least that’s the standard).
  • We want to play on different computers while being in different places. Playing without the Internet is a no-go anyway.
  • The game rules are the standard rules of Kniffel (later we might think about extending it with user-defined rules).
  • The game must be able to be paused and to resume where it was stopped the last time.
  • The code will be mostly PHP with some JavaScript for the GUI.

Now read this Wikipedia article and try to get the idea of object-oriented programming, because that’s what we’ll do now.

We look at the game as it is in real life and then try to model the computer code after the real thing. So once you have the user story, you first need a rough sketch of the structures before you start to code.


What do we need to start:

  • 5 dice
  • a dice cup
  • two or more users
  • a sheet of paper to track the game status
  • a set of rules
  • the GAME that holds all of these together

We first implement the core structure of the game; the nice GUI stuff comes later, on top of an already working though basic application. We might decide to interact with the game via browsers (the most likely solution) or could even think about some fancy way of sending mails back and forth. What we display to the end user at a later point has nothing to do with the game core. We will certainly not hard-wire anything about the GUI into the game core.


The game (object)

  • holds 2 to 8 users
  • has a cup
  • has 5 D6 dice
  • has a set of rules
  • has a sheet of paper where all results are marked down
  • knows whose turn it is
  • gets nasty when someone tries to cheat ;)
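
To give a first impression of where this is heading, here is a minimal PHP skeleton of such a Game class. All names and details are placeholders of mine, not the final design.

<?php
// Rough sketch only: the Game aggregates the other objects and knows whose
// turn it is. Rules, the score sheet and cheating detection are left out.
class Game
{
    private $users = array();        // 2 to 8 User objects
    private $cup;                    // the Cup holding the 5 D6 dice
    private $ruleSet;                // the set of rules
    private $scoreSheet;             // the 'sheet of paper'
    private $currentUserIndex = 0;

    public function __construct($cup, $ruleSet, $scoreSheet)
    {
        $this->cup        = $cup;
        $this->ruleSet    = $ruleSet;
        $this->scoreSheet = $scoreSheet;
    }

    public function addUser($user)
    {
        if (count($this->users) >= 8) {
            throw new Exception('No more than 8 players allowed.');
        }
        $this->users[] = $user;
    }

    public function getCurrentUser()
    {
        return $this->users[$this->currentUserIndex];
    }

    public function nextTurn()
    {
        $this->currentUserIndex =
            ($this->currentUserIndex + 1) % count($this->users);
    }
}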

The user (object)

  • has a unique user id
  • is told when it is his turn to play
  • is given the cup
  • is given the dice
  • can put dice into the cup
  • can roll the dice
  • can decide to keep some dice fixed and roll the remainder again, up to 3 times
  • can decide to make ‘rule objects’ from the dice results (kniffel, full house, etc.)

The cup (object)

  • can be filled with dice
  • can roll all dice it holds
  • can show the dice results
  • dice can be removed

Note: the cup object is unaware of which user currently holds the cup or what kind of dice it contains and how many. It does its part and that’s it.


The D6 dice (object):

  • is of type D6, which means it has 6 faces with values 1-6
  • can be rolled
  • holds its current face until rolled again
  • knows if it was looked at after rolling

Memo to self: beware of random numbers and statistics here; at least check the distribution in a unit test. There was something about random seeds and pseudo-random numbers.
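
As a rough idea of such a check, a PHPUnit-style test could roll the die a lot of times and complain if one face shows up far too often or too seldom. It assumes a D6 class with roll() and getValue() methods (placeholder names, see the sketch further down):

<?php
// Sketch: roll the die many times and check that every face comes up roughly
// equally often. This only catches grossly broken randomness, nothing subtle.
class D6DistributionTest extends PHPUnit_Framework_TestCase
{
    public function testFacesAreRoughlyEvenlyDistributed()
    {
        $rolls  = 60000;
        $counts = array(1 => 0, 2 => 0, 3 => 0, 4 => 0, 5 => 0, 6 => 0);

        $dice = new D6(1);               // assumed dice class, id 1
        for ($i = 0; $i < $rolls; $i++) {
            $dice->roll();
            $counts[$dice->getValue()]++;
        }

        $expected = $rolls / 6;
        foreach ($counts as $face => $count) {
            // Allow 10% deviation; a real test would use a proper statistical
            // bound instead of a magic number.
            $deviation = abs($count - $expected) / $expected;
            $this->assertLessThan(0.1, $deviation, "Face {$face} is far off the expected count.");
        }
    }
}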

Note: The dice is unaware of which user currently holds the cup or how many other dice are in the cup. It does its part and that’s it.


A little more on dice:

I’m bored to death when I have to implement the same code, slightly altered, over and over again. Reusing classes spares you a lot of time. Later we might want to play a game with different rules and D20 dice, but that will still be a game that holds users, rules (though different ones), a sheet of paper, a cup and dice.
The cup does not care how many dice it holds or whether the returned values range from 1 to 6 or from bla to foo. So we will generate the dice in a factory that gives us D6 for the Kniffel game but might later be extended to give us D20 as well.
To make sure the cup gets what it needs, we define a common interface that all dice must implement.

The Dice factory:

  • is told we want a D6 with ID 1
  • might also be told we want a D20 with values 1 to 20
  • generates the correct dice object for us

Btw, we just started using interfaces and design patterns ;)
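
A sketch of what that could look like in PHP; class and method names are my placeholders, not a finished design.

<?php
// Common contract every dice type has to fulfil, so the cup can treat
// all dice the same way.
interface DiceInterface
{
    public function roll();
    public function getValue();
    public function getId();
}

// A six-sided die; a D20 would look the same with $faces = 20.
class D6 implements DiceInterface
{
    private $id;
    private $faces = 6;
    private $value = null;

    public function __construct($id)
    {
        $this->id = $id;
    }

    public function roll()
    {
        $this->value = mt_rand(1, $this->faces);
    }

    public function getValue()
    {
        return $this->value;
    }

    public function getId()
    {
        return $this->id;
    }
}

// The factory hides which concrete class gets created; the game just asks
// for a dice of a given type.
class DiceFactory
{
    public function createDice($type, $id)
    {
        switch ($type) {
            case 'D6':
                return new D6($id);
            // case 'D20': return new D20($id);  // later
            default:
                throw new InvalidArgumentException("Unknown dice type {$type}");
        }
    }
}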


Hope you get the idea: the same way cup and dice are defined, we will define the set of rules, the ‘sheet of paper’ etc.

Different persons could now start to implement, for example, cup and dice. The cup developer needs to know the public interface the dice will have, as he needs to call this functionality on the dice objects. But for developing the cup, he does not need to wait until the dice class is completely implemented. Remember the cup ‘is given the dice’? That is dependency injection: the cup can get mocked or stubbed dice for unit tests, so the cup can be completed even if the dice is not yet finished.
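
A short sketch of that idea, building on the DiceInterface from the previous sketch: the Cup only knows the interface, so a unit test can hand it stubbed dice long before the real D6 class exists.

<?php
// The cup gets its dice injected; it never creates them itself.
class Cup
{
    private $dice = array();   // DiceInterface[]

    public function addDice(DiceInterface $dice)
    {
        $this->dice[] = $dice;
    }

    public function rollAll()
    {
        foreach ($this->dice as $dice) {
            $dice->roll();
        }
    }

    public function getValues()
    {
        $values = array();
        foreach ($this->dice as $dice) {
            $values[] = $dice->getValue();
        }
        return $values;
    }
}

// Unit test with stubbed dice: the Cup can be finished and tested even if
// no real dice class has been written yet.
class CupTest extends PHPUnit_Framework_TestCase
{
    public function testCupReturnsValuesOfItsDice()
    {
        $cup = new Cup();
        foreach (array(3, 5) as $fixedValue) {
            $dice = $this->getMock('DiceInterface');
            $dice->expects($this->any())
                 ->method('getValue')
                 ->will($this->returnValue($fixedValue));
            $cup->addDice($dice);
        }

        $cup->rollAll();
        $this->assertEquals(array(3, 5), $cup->getValues());
    }
}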

If you do this (and you will do it on a larger scale at some point), make sure you run some full integration tests on the whole application later. It happens too often that someone changes e.g. the return value of a public function without remembering that the calling class expects an array and not the boolean it gets now. If you have mocked that return value in all the related unit tests, you are in big trouble: your tests are all green but the application is not behaving as expected, yeah.

Oh, before my boss stumbles upon this and starts complaining that I did not mention it: use TDD (test-driven development), then you always end up with a completely tested application.

Enough for now, guess what I’ll start doing right now :p

p.s.: example code will be seen in later blog posts, stay tuned.

Posted in OOP, PHP | Tagged , , , | Leave a comment

OXID eShop CE 4.6.0

In ‘OXID eShop from a different point of view’ I described how to run OXID eShop Community Edition without database views. The obstacles of a cheap WebPack ;)

Time for an update, as OXID eShop Community Edition 4.6.0 is out now. Info about the new features can be found here. The downloadable products are pretty cool.
And hey, Hosteurope WebPack L 3.0 now allows database views, so setting up the shop works without any nasty hacks in just a minute. My test shop now runs on PHP 5.2.17 with MySQL 5.5.22. The upgrade to PHP 5.3 will be done soonish.

Btw, if you want to connect the shop to OXID eFire, you need to set the file permissions of the core directory to 770, because that’s where the file oxefi.php is written when you download it via the admin backend.

Nasty shop, it is very insistent on deleting the setup directory after I switched the shop to live mode :D

No articles shown in the admin backend after a fresh shop installation with demo data? The articles are definitely there: I installed the test data and successfully finished a test order. Cleaning the tmp folder does not help. In the shop admin section Service -> Tools there is a button named ‘Update VIEWS now’; click it and you’ll finally see your article data in the admin area as well.

Posted in oxshop, PHP | Tagged , , , , , | Leave a comment

MySQL – The Name of the Table

Beware when renaming a table if there are other tables around that have foreign keys on the renamed table.

Following case: we want to exchange a table for a better-structured one, but for some reason do not want to run ALTER TABLE on the original table. So the way out is to create a new table with the new structure, transfer all the necessary data from the original table into it and then swap the tables by renaming them. You will need this when you have to alter large tables in a 24/7 running application and cannot afford ALTER TABLE queries that run for hours. See the MySQL documentation: ALTER TABLE will block write access until the table structure is changed, and that can take a long time for large tables.

Example code: test setup
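
The linked test setup is not reproduced here. Judging from the queries below, it boils down to something like the following; the exact column types and the constraint name are my guesses, the important parts are the data_base table and a second table data_extended with a foreign key on it:

CREATE TABLE data_base (
id INT(11) NOT NULL AUTO_INCREMENT COMMENT 'the id',
data VARCHAR(128) NOT NULL DEFAULT '' COMMENT 'the data',
last_updated TIMESTAMP ON UPDATE CURRENT_TIMESTAMP DEFAULT '0000-00-00 00:00:00' COMMENT 'date of last update',
PRIMARY KEY ( id )
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE data_extended (
id INT(11) NOT NULL AUTO_INCREMENT COMMENT 'the id',
data_base_id INT(11) NOT NULL COMMENT 'reference to data_base',
PRIMARY KEY ( id ),
CONSTRAINT `fk_data_base_id` FOREIGN KEY ( data_base_id ) REFERENCES data_base ( id )
) ENGINE=InnoDB DEFAULT CHARSET=utf8;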

Now we do the following:

CREATE TABLE data_base_new (
id INT(11) NOT NULL AUTO_INCREMENT COMMENT 'the id',
data VARCHAR(128) NOT NULL DEFAULT '' COMMENT 'the data',
more_data VARCHAR(128) NOT NULL DEFAULT '' COMMENT 'some more data',
last_updated TIMESTAMP ON UPDATE CURRENT_TIMESTAMP DEFAULT '0000-00-00 00:00:00' COMMENT 'date of last update',
PRIMARY KEY ( id ),
INDEX `idx_last_updated` ( last_updated )
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO data_base_new ( id, data, last_updated ) SELECT a.id, a.data, a.last_updated FROM data_base a;

Switch from old to new table:

RENAME TABLE data_base TO data_base_old;
RENAME TABLE data_base_new TO data_base;

If you did not remember until now that there is another table, data_extended, with a foreign key on table data_base, you will notice when trying to DROP TABLE data_base_old:

ERROR 1217 (23000): Cannot delete or update a parent row: a foreign key constraint fails


SHOW CREATE TABLE data_extended;

explains why: the foreign key constraint of data_extended now references data_base_old, the renamed original table.

The foreign key constraints stick to their original table, no matter if you change its name. A nasty one if you are now running your application with the wrong foreign key constraint ;)

So far I have not found any other way than to drop the foreign keys, rename the tables and then set the foreign keys again. Or duplicate all related tables, not only the master table.
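
Assuming the constraint is named fk_data_base_id and references a column data_base_id (take the real names from SHOW CREATE TABLE data_extended), the workflow then looks roughly like this:

ALTER TABLE data_extended DROP FOREIGN KEY fk_data_base_id;

RENAME TABLE data_base TO data_base_old;
RENAME TABLE data_base_new TO data_base;

ALTER TABLE data_extended ADD CONSTRAINT fk_data_base_id
FOREIGN KEY ( data_base_id ) REFERENCES data_base ( id );

DROP TABLE data_base_old;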

If you need to know: no, we did not frak up our live database, but we sometimes think about what we might need in the future, and this was part of finding out the bad things beforehand :)

Posted in howto, mysql | Leave a comment

MySQL hates you …

… if it is badly treated ;)

This post will show something about MySQL locks, transactions and autocommit.

Before we start, here’s some useful stuff to know:

  • You get the MySQL connection id with SELECT CONNECTION_ID();
  • You get information about the in-use status of tables with SHOW OPEN TABLES FROM the_database;

And here’s the test setup for database and table:

The test setup
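
The linked setup is not reproduced here; judging from the examples below it is roughly the following. The exact columns and the four initial rows are my assumption, the important parts are an InnoDB table with an auto-increment primary key and four existing entries, the last one being 'wahoo' with id 4:

CREATE DATABASE the_database;
USE the_database;

CREATE TABLE data_base (
id INT(11) NOT NULL AUTO_INCREMENT COMMENT 'the id',
data VARCHAR(128) NOT NULL DEFAULT '' COMMENT 'the data',
PRIMARY KEY ( id )
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO data_base ( data ) VALUES ( 'one' ), ( 'two' ), ( 'three' ), ( 'wahoo' );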

Open two tabs in a terminal and connect to MySQL in each of them. We will name those connections Acon and Bcon. Execute SELECT CONNECTION_ID(); in both and you will see that the two connections have different ids.


LOCKS

Acon:
LOCK TABLE data_base WRITE;
INSERT INTO data_base ( data ) VALUES ( 'hurz' );
SHOW OPEN TABLES FROM the_database;

You will see that table data_base is now marked as in use. The new data is inserted into table data_base and has id 5.

Bcon:
INSERT INTO data_base ( data ) VALUES ( 'asdf' );

… and while you wait, grab some lunch, get a coffee, have a chat with colleagues …
Forget it, the other connection holds the write lock on that table.

Acon:
UNLOCK TABLES;

And in Bcon the pending insert is immediately executed; the data is in there with id 6 now. The write lock is also released when the connection holding the lock crashes (worksforme locally).


AUTOCOMMIT

Acon:
SET AUTOCOMMIT = 0;
INSERT INTO data_base ( id, data ) VALUES ( '5', 'hurz' );
SELECT id, data FROM data_base ORDER BY id DESC LIMIT 1;

Answer: '5', 'hurz'

Bcon:
SELECT id, data FROM data_base ORDER BY id DESC LIMIT 1;

Answer: '4', 'wahoo'. That means: Acon sees its own inserted data, but as autocommit is turned off, Bcon does not yet see the new entry.

Bcon:
INSERT INTO data_base ( id, data ) VALUES ( '5', 'qwer' );

… and while you wait, grab some lunch, get a coffee, have a chat with colleagues …
And when you have waited long enough, the answer is

ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

The problem here is: Acon already inserted data for key '5', but as autocommit is turned off, that data cannot yet be seen by Bcon. As Bcon tries to insert data with an already existing id, it gets stuck and waits. The good thing is that there is a timeout, the bad thing is that the data might not be inserted.

Try to insert the data via Bcon again and you’ll start waiting again. As soon as you turn autocommit back on in Acon (which commits the pending insert), Bcon tries to execute the query and gives an error message:

ERROR 1062 (23000): Duplicate entry '5' for key 'PRIMARY'

What do we learn from here?
Do a full reset and do the same again, but when inserting, do not explicitly set the primary key:

INSERT INTO data_base ( data ) VALUES ( 'hurz' );
SELECT LAST_INSERT_ID(); => answer is 5

In Bcon (that still only sees 4 entries) now do:

SELECT LAST_INSERT_ID(); => answer is 0, as nothing has been inserted on this connection yet.
INSERT INTO data_base ( data ) VALUES ( 'qwer' );
SELECT LAST_INSERT_ID(); => answer is 6, query was executed immediately.
SELECT * FROM data_base; => you see id 1 to 4 and 6, but no entry yet with id 5.

The ‘not yet visible for other connections’ insert from Acon still cannot be seen in Bcon, but as no explicit primary key was supplied, it did not block the insert in Bcon.

Connections might block each other if they both try to insert data for the same primary key. This will lead to slower query execution or maybe even data loss.


TRANSACTIONS

When you have a bunch of queries that should be executed together or not at all, start a transaction, execute the queries and, if all went well, say COMMIT; if there’s a problem, do a ROLLBACK.

Note: transactions cannot be used for all kinds of queries. Statements that affect the table structure, like DROP/ADD KEY, DROP/ADD COLUMN, RENAME …, trigger an implicit commit when called inside a transaction.
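
A quick way to see this, best done on a scratch copy of the table so the ids used below stay untouched (the added column is just an example):

START TRANSACTION;
INSERT INTO data_base ( data ) VALUES ( 'inside transaction' );
ALTER TABLE data_base ADD COLUMN note VARCHAR(32) NULL;   -- DDL: triggers an implicit commit
ROLLBACK;                                                 -- too late, nothing left to roll back
SELECT data FROM data_base ORDER BY id DESC LIMIT 1;      -- 'inside transaction' is still there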

Acon:
START TRANSACTION;
INSERT INTO data_base ( id, data ) VALUES ( '5', 'hurz' );

Bcon cannot see this inserted data yet, as the transaction is still pending. Trying to insert data with the same primary key in Bcon gives us a query that fails with

ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

as we could expect.

The transaction in Acon can be ended by either COMMIT or ROLLBACK. If the connection crashes while a transaction is still in progress, nothing is committed: all inserts that were done inside the transaction in Acon are lost if Acon crashes while the transaction is open.

If you do not explicitly supply the primary key when inserting data, you will notice that your inserts in Bcon get primary keys that are not yet taken by inserts done in not yet finished transactions in Acon.
As long as you don’t insist on choosing the primary keys of your inserts yourself, no insert will be forced to wait until other connections have finished their transactions.

Posted in howto, mysql | Leave a comment