Posts tagged ‘Java Searching Data’

Java Tutorial – Sorting and Searching Data

We also have some excellent Beginners Programming Java tutorials that use high quality narrated videos and practical working files to teach the fundamentals of programming in Java Beginners Java Tutorial CD

.

Java Tutorial – Sorting, Searching and Recursion

Once you start to use data structures in Java, you will inevitably encounter situations in which you need to sort or search through them. For the purposes of this beginners Java tutorial, we will use arrays to illustrate some basic sorting, searching and recursive algorithms.

An algorithm is a procedure, a series of steps that you define to solve a particular problem. In the examples below, our algorithms will be expressed in the Java code that we use in our programs.

Recursion

Some of the techniques we will be using rely on recursive processes, so let’s explore these first. A recursive method in your program is basically one that calls itself, i.e. within the method body, there will be a call to the method itself. Using recursive methods effectively can take a bit of practice, as your program can easily get caught in a loop, so you need to think the procedure in any such method through carefully.

To demonstrate a recursive function in action, let’s create a trivial example. Create a new Java project and enter the following in your main class, after the main method:

//recursive method
public static int doSomething(int someNumber)
{
		//check the value of the input parameter
	if(someNumber<=0)
		return someNumber;
	else
	{
			//decrement and call the method again
		someNumber--;
		System.out.println(someNumber);
		return(doSomething(someNumber));
	}
}

Now call the new method within your main method:

int result = doSomething(10);
System.out.println("done: " + result);

All the program does is count down from the passed int value to zero. Experiment by changing the value of the int parameter passed to the method and observe what effect it has on the output printed to the console. The method manages to avoid getting stuck in a loop (caused by calling itself repeatedly) by having the conditional ‘if’ statement.

This simple recursive method has much the same effect as a loop, but, as you’ll see, recursion can be particularly useful when you have a complex problem that is best solved in stages, each stage taking you a step closer to the solution by simplifying it. A necessary ingredient for recursive methods then is that you specify explicitly how the method should behave with the simplest input, and then with each iteration, you work towards this simplest case. In the example above, the simplest case is when the int parameter is less than or equal to zero.

Try not to worry if this seems an odd practice at first, once you start using sorting methods the recursive approach will become clearer.

Sorting

The main reason for keeping a collection of data in sorted order is that it is far easier to find items within the data and therefore maintain it effectively. Keeping the data in sorted order does, however, involve a certain amount of processing, particularly when adding or deleting items.

When we say, for example, that an array is in sorted order, we mean that the items within the array are arranged according to some understood ordering system. Sometimes this is intuitive, for example if the array contains numbers, sorted order will naturally comprise numbers stored according to ascending (or sometimes descending) numerical value. Similarly, if the array contains String values, sorted order will generally mean alphabetical order.

However, there are cases when sorted order will be less clear – imagine you have created your own class declaration for use within your program, and you wish to store objects of your class in the array. You will have to decide then what constitutes sorted order for these objects. In order to do this you have to think about comparing two objects of the same class. When you compare two objects of the class, it must be that case that either one is ‘greater’ than the other, or they are ‘equal’. This is how your objects will be ordered, and to provide this in your own classes they will need to implement the Comparable interface, which means that your objects provide a compareTo method. Within this method you will need to decide how to measure the objects of your class against one another, for example, you might decide to use the value of some variable in the class.

Note: be careful when using the ‘equals’ method on your objects also, as the default equals method inherited from the Class Object, actually tests the object references, i.e. it tests to see whether the two variable names are pointing to the same object. If you plan to use the equals method, you would be best to provide your own implementation of the method, overriding the default for this reason.

For the purposes of this Java tutorial, we will stick to simple types in order to illustrate the algorithms required.

Selection Sort

Selection sort works by continually finding the next smallest item in the collection and placing it at the front. Create a new program and enter the following code in your main method to create an array with some arbitrary data in it:

	//array with unsorted int data
int [] myData = {3, 1, 9, 5, 7};

Now we will use selection sort to arrange the items in the array in ascending numerical order. The selection sort technique is as follows:

  • Within the section of data that is yet to be sorted (initially the whole structure), find the smallest item
  • Place this item at the first position (of the unsorted section), swapping it with any element that’s already there
  • The unsorted section now contains one less item (the item just deemed smallest and therefore moved)
  • Continue placing the smallest unsorted item in first position until all of the data has been sorted

Enter the following code after declaring the array and compile your program, observing its output:

	//loop through the array
for(int pos=0; pos<myData.length-1; pos++)
{
		/* find and keep track of the smallest
		 * element's position
		 * -start at pos and look through the rest
		 * (pos is at the start of the unsorted section)
		 */
	int minIndex = pos;

		//loop through the unsorted section
	for(int next=pos+1; next<myData.length; next++)
	{
		if(myData[minIndex]>myData[next])
			minIndex=next;
	}

		//swap the item if necessary
	if(minIndex!=pos)
	{
			//keep track of the value currently at pos
		int temp = myData[pos];
			//copy the new smallest value into pos
		myData[pos] = myData[minIndex];
			//copy the swapped value into minIndex
		myData[minIndex] = temp;
	}

		//print the array for demonstration
	for(int p : myData)
	{	System.out.print(p);	}
	System.out.println();
}

When you look at the program output, try to work through which items are being swapped with one another at each stage – experiment by changing the int values (or their order) in the array.

Merge Sort

A more efficient approach to sorting the contents of an array is Merge Sort, which uses a recursive algorithm. This rests on the idea that merging two already sorted arrays is a simpler problem than sorting all of the data in one unsorted array. The algorithm therefore sorts an unsorted array by continually dividing it into two parts and sorting each part, merging the results into a final sorted array.

Add the following method to your program (after the main method):

/* mergeSort method sorts the data in a section of
	 * the array passed
	 * -parameters are the array containing the data to
	 *  be sorted, a temporary array for use with the
	 *  processing, and integers representing the start
	 *  and end positions of what data is to be sorted
	 *  within the array
	 */
public static void mergeSort(int[] dataArray, int[] tempArray,
							int start, int end)
{
		//array has been sorted
	if(start==end)
		return;
		//work towards sorted array by breaking the problem down
	else
	{
			//divide the data into two
		int center = (start+end)/2;
			//call the method on the first half
		mergeSort(dataArray, tempArray, start, center);
			//second half
		mergeSort(dataArray, tempArray, center+1, end);

			//merge the two halves into one sorted half
		int firstIndex = start;
		int secondIndex = center+1;
		int thirdIndex = start;

			//loop through both halves
		while(thirdIndex<=end)
		{
				/* work through both halves
				 * inserting smallest item into
				 * the temp array each time
				 */
			if(secondIndex>end ||
			(firstIndex<=center &&
			dataArray[firstIndex]<=dataArray[secondIndex]))
			{
				tempArray[thirdIndex]=dataArray[firstIndex];
				thirdIndex++;
				firstIndex++;
			}
			else
			{
				tempArray[thirdIndex]=dataArray[secondIndex];
				thirdIndex++;
				secondIndex++;
			}
		}
			//copy the temp contents into the main array
		for(int i=start; i<=end; i++)
			dataArray[i]=tempArray[i];
	}
}

Each time the method reaches the point of merging two halves, each half has already been sorted, since the mergeSort method has been first called on them, and on their subsections before they come to be merged, and so on. The method therefore solves the problem by repeatedly simplifying it and solving the simpler problems in turn. Now call the new method from your main method:

	//array to sort
int[] someData = {4, 3, 7, 6, 2, 8, 5, 9, 1};
	//extra helper array
int[] extraArray = new int[someData.length];
	//call the mergesort method on the whole array
mergeSort(someData, extraArray, 0, someData.length-1);

	//print the sorted array for testing
for(int pr : someData)
{ System.out.print(pr); }

Again, experiment by changing the values in the input array and observing the program output.

It can be difficult to get into the ‘recursive’ way of thinking and certainly requires practice, but once you do you will tend to find that you identify problems that could benefit from a recursive solution fairly quickly. It can help when writing a recursive method if you start from the simplest case, then take care of the steps that work towards it each time the method executes (rather than starting from the most complex input, which is the first one to be executed, and can therefore seem an intuitive starting point).

Searching

You will often face the task of finding a particular element in a data structure for one reason or another. If your array data is unsorted, the only way to do this is to look through each of the elements in turn. This is known as linear search – enter the following code in your main method:

//collection of arbitrary strings
String[] myStrings = {"potato", "carrot", "turnip",
		"onion", "leek", "courgette", "aubergine"};
	//find the index of 'leek'
int foundIndex=-1;
	//loop through the array
for(int veg=0; veg
	//binary search method
public static int binarySearch(String[] theData, String wanted)
{
		//-1 means item not found
	int foundIndex=-1;

		//keep track of both ends of searchable area
	int top = theData.length-1;
	int bottom = 0;

		//loop through, dividing searchable area each time
	while(bottom<=top)
	{
			//look at middle element
		int middle = (top+bottom)/2;

			//compare middle element to item searched for
		int comp = theData[middle].compareTo(wanted);

			//middle is item searched for
		if(comp==0)
			return middle;

			/* if middle element is greater,
			 * wanted element must be in lower half
			 * and vice-versa, so we reduce the
			 * searchable area accordingly
			 */

		else if(comp>0) //middle is greater
			top=middle-1;
		else //middle is lesser
			bottom=middle+1;

	}
		//finished searching
	return foundIndex;
}

Now call the method within your main method:

//array of sorted strings
String[] sortedStrings = {"aubergine", "carrot", "courgette",
		"leek", "onion", "potato", "turnip"};

	//call the method - returns -1 if item not found
int searchResult = binarySearch(sortedStrings, "potato");

System.out.println("found at index: "+searchResult);

Experiment with the code by searching for different elements in the array – you can also check how many iterations of the loop are required each time by including a System.out.print statement within the while loop in your binarySearch method. It is generally a good practice to include such ‘trace statements’ in your code, to check exactly what is happening at any stage in the program; it can be particularly helpful when debugging larger programs.

Although many of Java’s built-in datatypes provide sorting and searching methods, it can help with the efficiency of your own programs to think about the algorithms involved. Also, when you come to define your own data types, you may need to implement such methods yourself. There are numerous approaches to sorting and searching in Java, the examples above being only an introduction.

Advertisement

January 7, 2009 at 6:32 pm Leave a comment