Hi Ryan.

> - Use template metaprogramming tricks to, given a type, expand all of
>   its constructor arguments into a list of numeric types.  So say we
>   had:
> 
>     Learner(double a, AuxType b)
>     AuxType(double c, double d)
> 
>   we would ideally want to extract [double, double, double] as our list
>   of types.  I can't quickly think of a strategy for this but it
>   *might* be possible...


Even if we are able to implement this approach, I guess the usage will be quite 
unintuitive. The implementation (if it is possible) will force a user to pass 
AuxType constructor arguments into the hyper-parameter tuning module instead of 
AuxType objects themselves, since we can't extract constructor arguments from 
an already created object.

> - Refactor all classes that take an auxiliary class to instead take a
>   template parameter pack to be unpacked into the auxiliary classes'
>   constructors.  This will still be a fair amount of metaprogramming
>   effort but I can see a closer route to a solution with this one.


With this implementation the usage should be more understandable for users (in 
this solution we provide a constructor of MLAlgorithm that takes the arguments 
we are going to pass to the hyper-parameter tuning module), even though it is 
still quite complex (we need to pass AuxType constructor arguments instead of 
AuxType objects themselves in the first place). But what are we going to do 
when AuxType is std::unordered_map<size_t, std::pair<size_t, size_t>>* or 
arma::mat (they appear in HoeffdingTree and LARS respectively)?

>> 3. In the case of hyper-parameter tuning  I guess a loss function
>> should be a wrap for a cross validation class (we want to optimize
>> performance on validation sets). But it is not clear what type of
>> interface it should provide: DecomposableFunctionType (like for SGD)
>> or FunctionType (like for SA or GradientDescent, all prerequisites for
>> which can potentially be combined in one class).
> 
> I'm not sure I fully follow here, can you clarify?

Existing optimizers in mlpack take a FunctionType object as the first argument 
of their constructors. There are different requirements for what the 
FunctionType object should implement depending on the optimizer type. For 
instance, for SGD the FunctionType object should have the following method 
signature:

  double Evaluate(const arma::mat&, const size_t);

whereas for GradientDescent the FunctionType object should have this one:

  double Evaluate(const arma::mat&);

If I understand the whole discussion correctly, we are ready to restrict 
ourselves to optimising only numerical parameters in order to utilize the 
existing interface for optimisers. If so, I think it is quite possible to 
design a solution that allows the following usage.

  arma::mat data /* = ... */;
  arma::Row<size_t> labels /* = ... */;

  HyperParameterTuner<HoeffdingTree<>, Accuracy, KFoldCV>
      hoeffdingTreeTuner(data, labels);

  // Bound arguments
  data::DatasetInfo datasetInfo /* = ... */;
  size_t numClasses = 5;
  bool batchTraining = false;
  size_t checkInterval = 100;

  // Setting sets of values to check
  arma::vec successProbabilities = arma::regspace(0.9, 0.01, 0.99);
  std::array<size_t, 2> maxSamplesSet = {0, 3};
  std::array<size_t, 3> minSamplesSet = {50, 100, 150};

  // Making variables for best parameters
  double successProbability;
  size_t maxSamples;
  size_t minSamples;

  // Finding best parameters
  auto bestParameters =
      hoeffdingTreeTuner.Optimize<GridSearch>(Bind(datasetInfo),
          Bind(numClasses), Bind(batchTraining), successProbabilities,
          maxSamplesSet, Bind(checkInterval), minSamplesSet);

  // Unpacking best parameters
  std::tie(successProbability, maxSamples, minSamples) = bestParameters;

In this example we mark the arguments datasetInfo, numClasses, batchTraining, 
and checkInterval as being bound (they should not be optimised). For other 
HoeffdingTree constructor arguments we provide sets of values to investigate. 
Note also that we pass arguments in the same order as for the corresponding 
HoeffdingTree constructor.

The GridSearch interface will be similar to that of other optimisers.

  template<typename FunctionType, typename... Collections>
  class GridSearch
  {
  public:
    GridSearch(FunctionType& function,
               const Collections&... parameterCollections);

    double Optimize(arma::mat& iterate);
  };

A FunctionType function will be an instance of a cross validation wrapper 
class with approximately the following interface.
 
  template<typename CV, int TotalArgs, typename... BoundedArgs>
  class CVFunction
  {
  public:
    CVFunction(CV& cv, const BoundedArgs&... boundedArgs);

    double Evaluate(const arma::mat& parameters);
  };

During construction of a CVFunction object, we provide a cross validation 
object and a sequence of bound arguments, each of which carries information 
about its position in the argument list of the Evaluate method of the cross 
validation object; the total number of arguments that should be passed to that 
Evaluate method is specified by the template parameter TotalArgs.

With such a design we can reuse GridSearch in other mlpack algorithms, as well 
as add support for other mlpack optimisers in a relatively simple way. For 
example, it should be straightforward to add support for the GradientDescent 
optimiser with the following usage.

  HyperParameterTuner<SoftmaxRegression<>, Accuracy, KFoldCV>
      softmaxTuner(data, labels);

  // initial value for lambda
  double lambda = 0.001;

  // gradient descent parameters
  double stepSize = 0.001;
  size_t maxIterations = 20;

  double bestLambda;
  std::tie(bestLambda) =
      softmaxTuner.Optimize<GradientDescent>(Bind(numClasses), lambda,
          OptimizerArg(stepSize), OptimizerArg(maxIterations));

Let me know what you think about the proposed idea.

Best regards,

Kirill Mishchenko

> On 15 Apr 2017, at 01:29, Ryan Curtin <[email protected]> wrote:
> 
> On Mon, Apr 10, 2017 at 11:13:50AM +0500, Kirill Mishchenko wrote:
>> Hi Ryan,
>> 
>> I think I’m starting to see your perspective of how grid search
>> optimiser should be implemented. But some concerns remain.
> 
> Hi Kirill,
> 
> Sorry for the slow response.
> 
>> 1. Some information (precision) can be lost during conversions between
>> integer and floating-point values (e.g., during coding size_t value
>> into a cell of arma::mat). It is not very likely to happen in practice
>> (requiring very big values for integers), but it should be mentioned
>> anyway.
> 
> Agreed.  I think with an IEEE 754 double precision floating point number
> we get 2^54 possible values before loss of precision.
> 
>> 2. There are some other types of arguments in constructors for machine
>> learning algorithms (models) beside numeric types and
>> data::DatasetInfo. These include a template WeakLearnerType in
>> AdaBoost, templates CategoricalSplitType and NumericSplitType in
>> HoeffdingTree, std::unordered_map<size_t, std::pair<size_t, size_t>>*
>> in HoeffdingTree, arma::mat in LARS. Some non-numerical types of
>> arguments can also emerge in constructors of new machine learning
>> algorithms.
> 
> Yes, this is a little bit more difficult.  In most of these situations
> where a class instance is passed, it is usually so that the user can
> specify some of the numeric parameters to those class instances.  For
> instance the AdaBoost WeakLearnerType parameter is used to set the
> parameters of each weak learner that is built.
> 
> So I can see two possibilities although maybe there are more:
> 
> - Use template metaprogramming tricks to, given a type, expand all of
>   its constructor arguments into a list of numeric types.  So say we
>   had:
> 
>     Learner(double a, AuxType b)
>     AuxType(double c, double d)
> 
>   we would ideally want to extract [double, double, double] as our list
>   of types.  I can't quickly think of a strategy for this but it
>   *might* be possible...
> 
> - Refactor all classes that take an auxiliary class to instead take a
>   template parameter pack to be unpacked into the auxiliary classes'
>   constructors.  This will still be a fair amount of metaprogramming
>   effort but I can see a closer route to a solution with this one.
> 
> What do you think?  Do you have any additional ideas?  Note that I have
> not spent significant time thinking or playing with either of these
> ideas so I am not fully sure if they will work.
> 
>> 3. In the case of hyper-parameter tuning  I guess a loss function
>> should be a wrap for a cross validation class (we want to optimize
>> performance on validation sets). But it is not clear what type of
>> interface it should provide: DecomposableFunctionType (like for SGD)
>> or FunctionType (like for SA or GradientDescent, all prerequisites for
>> which can potentially be combined in one class).
> 
> I'm not sure I fully follow here, can you clarify?
> 
> Thanks,
> 
> Ryan
> 
> -- 
> Ryan Curtin    | "This room is green."
> [email protected] |   - Kazan

_______________________________________________
mlpack mailing list
[email protected]
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
