Posts Under Tag: GitHub

Updating a Forked GitHub Repository

If you’re going to contribute to an existing project on GitHub, your first step is to fork the repository.  The easiest way to do that is to look in the upper-right of the repo’s GitHub page and press the Fork button.  Doing this will fork, or create a copy of the repository under your GitHub account.

Fork a repo from GitHub via the browser

Fork a repo from GitHub via the browser

So you now own a copy of that repository.  What happens if the Master branch of the parent repository you forked from gets updated and your fork is now out of date?  Well, you may first start by reading the GitHub documentation on Syncing a Fork.  The instructions start by setting a remote upstream location.

$git remote add upstream https://github.com/otheruser/repo.git

You can think if this as instructing your fork as to where it came from and where it should update from.  Now that your fork knows where it needs to update from, you have to actually fetch that code.

$git fetch upstream

At this point you have your local fork and the upstream repository.  The GitHub documentation states you should then merge with the upstream/master.  However, if you do that and make a change, commit it, and then submit a pull request, what your pull request would look like would be your changes plus all the commits merged in from the origin.  Luckily this Stackoverflow answer provides an alternative, continuing with a rebase.

git rebase upstream/master

The rebase rewrites your master branch so that any of your commits that aren’t in the upstream/master are put on top of that branch.  This eliminates the problem where your merge with the origin would include all of your merging activity.  Finally, after the rebase on to the upstream you’ll need to force push this all to your fork on GitHub.

git push -f origin master

As noted on the Stackoverflow answer, the -f is only needed the first time after you’ve rebased.  At this point your own fork of the repository on GitHub should now be up to date with the repository you forked it from.

Tags: , Filled Under: Programming Posted on: September 20, 2013

My First GitHub Pull Request with LinqToExcel

I recently found myself in a situation dealing with a web application that  needed the ability to allow an administrator-type user upload an Excel list of records and then process that list.  What I mean by process the list is to add records from the Excel file that weren’t already in the database, update records that were different in the Excel file than what was in the database, and mark for deletion any records that were not in the Excel but were in the database.

Letting a user upload an Excel file that in turn modifies data records has always been a task that I’ve had reservations about.  There are so many variables to account for, not the least of which is hoping the user uploaded the correct Excel file in the correct format.  From a technical standpoint, performing an Excel import and the resulting data augmentation has always been sort of dicey too.  However, in order to help me out I thought to use a library that I had used before in another project named LinqToExcel.

As stated succinctly on the project’s GitHub page:

Linq to Excel is a .Net library that allows you to query Excel spreadsheets using the LINQ syntax.

To get started, I reviewed LinqToExcel’s documentation.  It’s actually quite good; so good that I’m not going to bother rehashing it but will provide some resources:

Clicking on those links it becomes obvious that LinqToExcel is simple to use with hardly any learning curve.  Basically, if you know how to use Linq, you can pick up LinqToExcel in a few minutes.

Now to the meat of this blog post: If LinqToExcel is such a great and easy to use tool to query Excel files, why would I find a need to change it?  LinqToExcel has a query property named StrictMapping.  The original implementation of StrictMapping was a boolean property that enforced the Excel worksheet’s columns to match the properties of the target class and vice versa.  If there was a mismatch, an exception is thrown.  For instance, the following situation would pass a StrictMapping enforcement check:

Excel Column Class Property StrictMapping Result
FirstName FirstName ok
LastName LastName ok
Title Title ok

For my situation, the StrictMapping check proved to be an excellent way to verify the user was uploading an Excel worksheet that contained the correct columns.  It’s one of the ways to validate the consumer’s data in the Excel worksheet.  However, in my scenario there were extra columns in the Excel worksheet that I wanted to ignore because they were not properties of my target class.  Because StrictMapping saw a mismatch, an exception was thrown.  For instance, here is a depiction of an Excel worksheet with the added column HireDate and the target class without that property:

Excel Column Class Property StrictMapping Result
FirstName FirstName ok
LastName LastName ok
Title Title ok
HireDate [none] Exception

By not being able to set StrictMapping in this situation, I was losing out on the enforcement it provided to verify the properties in my target class had Excel worksheet data matching it.  I determined that I needed to modify this property in order to make it work for me and luckily LinqToExcel is open source!

I began by forking the project on GitHub and changing the StrictMapping property from a boolean to a nullable enumeration:

public StrictMappingType? StrictMapping { get; set; }

The enumeration has values for multiple new conditions because I figured if I was going to expand StrictMapping, I may as well cover all possible scenarios:

namespace LinqToExcel.Query
{
    /// <summary>
    /// Class property and worksheet mapping enforcemment type.
    /// </summary>
    public enum StrictMappingType
    {
        /// <summary>
        /// All worksheet columns must map to a class property; all class properties must map to a worksheet columm.
        /// </summary>
        Both,

        /// <summary>
        /// All class properties must map to a worksheet column; other worksheet columns are ignored.
        /// </summary>
        ClassStrict,

        /// <summary>
        /// No checks are made to enforce worksheet column or class property mappings.
        /// </summary>
        None,

        /// <summary>
        /// All worksheet columns must map to a class property; other class properties are ignored.
        /// </summary>
        WorksheetStrict
    }
}

I chose to have the StrictMapping property a nullable enumeration by design.  Being the property was previously a boolean, there was logic in place to default the value to False if the property wasn’t explicitly set.  I modified that logic to similarly set the value to StrictMapping.None by default (see line 7 below):

internal ExcelQueryArgs(ExcelQueryConstructorArgs args)
{
	FileName = args.FileName;
	DatabaseEngine = args.DatabaseEngine;
	ColumnMappings = args.ColumnMappings ?? new Dictionary<string, string>();
	Transformations = args.Transformations ?? new Dictionary<string, Func<string, object>>();
	StrictMapping = args.StrictMapping ?? StrictMappingType.None;
}

Finally, I had to update the logic that determines if the StrictMapping condition was met by using the value set to the StrictMapping enum:

private void ConfirmStrictMapping(IEnumerable<string> columns, PropertyInfo[] properties, StrictMappingType strictMappingType)
{
	var propertyNames = properties.Select(x => x.Name);
	if (strictMappingType == StrictMappingType.ClassStrict || strictMappingType == StrictMappingType.Both)
	{
		foreach (var propertyName in propertyNames)
		{
			if (!columns.Contains(propertyName) && PropertyIsNotMapped(propertyName))
				throw new StrictMappingException("'{0}' property is not mapped to a column", propertyName);
		}
	}

	if (strictMappingType == StrictMappingType.WorksheetStrict || strictMappingType == StrictMappingType.Both)
	{
		foreach (var column in columns)
		{
			if (!propertyNames.Contains(column) && ColumnIsNotMapped(column))
				throw new StrictMappingException("'{0}' column is not mapped to a property", column);
		}
	}
}

private bool PropertyIsNotMapped(string propertyName)
{
	return !_args.ColumnMappings.Keys.Contains(propertyName);
}

private bool ColumnIsNotMapped(string columnName)
{
	return !_args.ColumnMappings.Values.Contains(columnName);
}

This was followed by updating and creating supporting positive and negative unit tests.  Once my changes were complete I was able to commit my fork and issue a pull request to the project owner where it was accepted several days later.

I am very excited to say the least.  The change I made to LinqToExcel is really not significant when taking in to account the library as a whole.  However, this was my first open source software contribution and it feels great to have been able to contribute something useful.  It is a true testament to the community of open source software that I was able to use this very helpful (and free) library and then contribute to it by solving a specific business need that I encountered.

Tags: , Filled Under: Programming Posted on: January 7, 2013