User defined functions (UDF) to enhance the functionality of the MySQL® database server
(you heard it first on W-FUQ-ALL-YA radio...)
What's this all about?
Although MySQL is a powerful and fast database server, there are certain tasks that cannot be done with it so easily.
If you've ever tried to directly from MySQL, you'll already know what I am writing about.
In fact, these tasks can only be performed in MySQL with ugly workarounds or require the use of additional programming on the client side.
This is where UDFs (User Defined Functions) come into play. UDFs are a nice and simple way to enhance your MySQL server's functionality with custom functions.
As you've already guessed, the aim of this page is to provide you with UDFs for some common tasks that can't be done in MySQL without UDFs that easily or not at all.
The UDFs provided via this page are available under a BSD style license. The UDFs are open source and you can also use them free of charge in your projects. Before I forget to mention it, MySQL AB is not responsible at all for the contents of this page. And of course, MySQL is a registered trademark of MySQL AB in the United States, the European Union and other countries.

Please click here if you want to skip the introductionary blah blah and go directly to list of available UDFs. Or go to project page on sourceforge.net and proceed from there.
Ugly workarounds
Let's assume the following initial situation: you have a MySQL version 4.0 server running and yet no UDFs installed.

Now, try to calculate the median of a column's values with MySQL.
At the moment (that means latest stable MySQL release 4.0.20), the median can be retrieved in MySQL with these workarounds: It seems that the median can not be calculated with a single query with current stable versions of MySQL.
Wouldn't this be nice?
The most preferred solution would be to be able to retrieve the median of a column together with other statistical data in a single query, e. g.:
 SELECT MIN({column name}) AS min, MAX({column name}) AS max, AVG({column name}) AS avg, STDDEV({column name}) AS stddev, COUNT({column name}) AS count, MEDIAN({column name}) AS median FROM {table name}
Why don't you do so?
In fact, you can. If you've read the MySQL manual, you'll know that you can extend the functionality of your MySQL server by supplying UDFs.
So, the only thing to do is to add a median() function to MySQL via a UDF.
There's a section in the MySQL manual named 23.2 Adding New Functions to MySQL that describes the most important things about UDFs. If you have any problems with UDFs, please look into the manual first.
Briefly described, you can write a C module that conforms with the MySQL UDF interface. If you issue a query using a custom function in MySQL, your C module will be used. First, you have to implement some kind of data type checking in your module, so that it ensures your function is called with the right data types (e. g. no STRING instead of a REAL).
After the input data passed your initial checking, a specific function in your module will called by MySQL for each row. In this function, you can do whatever you want with the input data (e. g. adding, sorting, concatenating). After that, you just write back the result. By the way, you can write aggregate and non-aggregate UDFs.
What do you have to do?
There are only a few steps you have to follow (all examples given are for the median UDF; of course you have to adjust all paths to your specific situation):
If you want to get rid of a UDF you have registered successfully, simply run mysql and type
mysql> DROP FUNCTION median;
Caveats
License
All UDFs provided here are subject to the following licensing conditions:
Copyright (c) 2004, Jan Steemann
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
List of available UDFs
The following list states which UDFs are available at the moment. The download location for all UDFs is http://prdownloads.sourceforge.net/mysql-udf/mysql-udf-0.3.tar.gz?download.
UDF median()
Function name median
Purpose calculate the median of a column
Aggregate function Yes
Input parameter(s) 1 (column: REAL)
Conditions -
Output value(s) 1 per group (REAL)
Examples SELECT MEDIAN(age) FROM bar
SELECT gender, MEDIAN(age),AVG(age) FROM bar GROUP BY gender
Registration CREATE AGGREGATE FUNCTION median RETURNS REAL SONAME 'udf_median.so';
UDF correlation()
Function name correlation
Purpose calculate the correlation coefficient of the linear regression of two sets of variables
Aggregate function Yes
Input parameter(s) 1 (dependent variable: REAL,
independent variable: REAL)
Conditions -
Output value(s) 1 per group (REAL)
Examples SELECT correlation(income,age) FROM customers
Registration CREATE AGGREGATE FUNCTION correlation RETURNS REAL SONAME 'udf_correlation.so';
UDF intercept()
Function name intercept
Purpose calculate the intercept of the linear regression of two sets of variables
Aggregate function Yes
Input parameter(s) 2 (dependent variable: REAL,
independent variable: REAL)
Conditions -
Output value(s) 1 per group (REAL)
Examples SELECT intercept(income,age) FROM customers
Registration CREATE AGGREGATE FUNCTION intercept RETURNS REAL SONAME 'udf_intercept.so';
UDF slope()
Function name slope
Purpose calculate the slope of the linear regression of two sets of variables
Aggregate function Yes
Input parameter(s) 2 (dependent variable: REAL,
independent variable: REAL)
Conditions -
Output value(s) 1 per group (REAL)
Examples SELECT slope(income,age) FROM customers
Registration CREATE AGGREGATE FUNCTION slope RETURNS REAL SONAME 'udf_slope.so';
UDF skewness()
Function name skewness
Purpose calculate the skewness of a distribution of values
Aggregate function Yes
Input parameter(s) 1 (data: REAL)
Conditions -
Output value(s) 1 per group (REAL)
Examples SELECT skewness(age) FROM customers
Registration CREATE AGGREGATE FUNCTION skewness RETURNS REAL SONAME 'udf_skewness.so';
UDF kurtosis()
Function name kurtosis
Purpose calculate the kurtosis of a distribution of values
Aggregate function Yes
Input parameter(s) 1 (data: REAL)
Conditions -
Output value(s) 1 per group (REAL)
Examples SELECT kurtosis(age) FROM customers
Registration CREATE AGGREGATE FUNCTION kurtosis RETURNS REAL SONAME 'udf_kurtosis.so';
UDF confidence_higher()
Function name confidence_higher
Purpose calculate the upper bound of the confidence interval for a given standard deviation, sample size, mean and confidence probabilty
Aggregate function No
Input parameter(s) 4 or 5 (confidence probability p: INT or REAL,
sample size n: INT,
arithmetic mean my: INT or REAL,
standard deviation s: INT or REAL,
number of decimals: INT, optional)
Conditions -
Output value(s) 1 per row (REAL)
Examples SELECT confidence_higher(0.95,2000,550,60,4)
Registration CREATE FUNCTION confidence_higher RETURNS REAL SONAME 'udf_confidence_higher.so';
UDF confidence_lower()
Function name confidence_lower
Purpose calculate the lower bound of the confidence interval for a given standard deviation, sample size, mean and confidence probabilty
Aggregate function No
Input parameter(s) 4 or 5 (confidence probability p: INT or REAL,
sample size n: INT,
arithmetic mean my: INT or REAL,
standard deviation s: INT or REAL,
number of decimals: INT, optional)
Conditions -
Output value(s) 1 per row (REAL)
Examples SELECT confidence_lower(0.95,2000,550,60,4)
Registration CREATE FUNCTION confidence_lower RETURNS REAL SONAME 'udf_confidence_lower.so';
UDF stdnorm_density()
Function name stdnorm_density
Purpose calculate the value of the density function of the standard normal distribution for a given point
Aggregate function No
Input parameter(s) 1 or 2 (point p: INT or REAL,
number of decimals: INT, optional)
Conditions -
Output value(s) 1 per row (REAL)
Examples SELECT stdnorm_density(0,4)
Registration CREATE FUNCTION stdnorm_density RETURNS REAL SONAME 'udf_stdnorm_density.so';
UDF stdnorm_dist()
Function name stdnorm_dist
Purpose calculate the value of the distribution function of the standard normal distribution for a given point
Aggregate function No
Input parameter(s) 1 or 2 (point p: INT or REAL,
number of decimals: INT, optional)
Conditions -
Output value(s) 1 per row (REAL)
Examples SELECT stdnorm_dist(0,4)
Registration CREATE FUNCTION stdnorm_dist RETURNS REAL SONAME 'udf_stdnorm_dist.so';
UDF geomean()
Function name geomean
Purpose calculate the geometric mean of a column
Aggregate function Yes
Input parameter(s) 1 (data: REAL)
Conditions -
Output value(s) 1 per group (REAL)
Examples SELECT geomean(price) FROM products
Registration CREATE AGGREGATE FUNCTION geomean RETURNS REAL SONAME 'udf_geomean.so';
UDF weightedavg()
Function name weightedavg
Purpose calculate the weighted average of a values
Aggregate function Yes
Input parameter(s) 2 (data: REAL)
(weight: REAL)
Conditions -
Output value(s) 1 per group (REAL)
Examples SELECT weightedavg(voting,categoryweight) FROM votes
Registration CREATE AGGREGATE FUNCTION weightedavg RETURNS REAL SONAME 'udf_weightedavg.so';
UDF noverm()
Function name noverm
Purpose calculate the number of possible combinations of size m from a set of n total values
(without regarding the order of values in combinations)
Aggregate function No
Input parameter(s) 2 (n: INTEGER,
m: INTEGER)
Conditions n must be greater than 1, m must be between 1 and m
Output value(s) 1 (REAL)
Examples SELECT noverm(10,2)
SELECT noverm(foo,bar) FROM baz
Registration CREATE FUNCTION noverm RETURNS INTEGER SONAME 'udf_noverm.so';
UDF faculty()
Function name faculty
Purpose calculate the faculty (n!) of a number (1*2*3*4*5*6...*n)
Aggregate function No
Input parameter(s) 1 (n: INTEGER)
Conditions n must be greater than 0
Output value(s) 1 (REAL)
Examples SELECT faculty(20)
SELECT faculty(foo) FROM bar
Registration CREATE FUNCTION faculty RETURNS REAL SONAME 'udf_faculty.so';
UDF colwidth()
Function name colwidth
Purpose calculate the length of the longest value in a STRING column
Aggregate function Yes
Input parameter(s) 1 (column: STRING)
Conditions -
Output value(s) 1 per group (INTEGER)
Examples SELECT COLWIDTH(request_vars) FROM request_log
SELECT session.id,COLWIDTH(session_vars.value) FROM session,session_vars WHERE session.id=session_vars.session_id GROUP BY session_id
Registration CREATE AGGREGATE FUNCTION COLWIDTH RETURNS INTEGER SONAME 'udf_colwidth.so';
UDF longest()
Function name longest
Purpose get the longest value in a STRING column
Aggregate function Yes
Input parameter(s) 1 (column: STRING)
Conditions -
Output value(s) 1 per group (STRING)
Examples SELECT LONGEST(name) FROM users
SELECT LONGEST(request_vars) FROM request_log GROUP BY script_name
Registration CREATE AGGREGATE FUNCTION LONGEST RETURNS STRING SONAME 'udf_longest.so';

SourceForge.net Logo    Valid XHTML 1.0!